The first two chunks of this r markdown file after the r setup allow for plot zooming, but it also means that the html file must be opened in a browser to view the document properly. When it knits in RStudio the preview will appear empty but the html when opened in a browser will have all the info and you can click on each plot to Zoom in on it.

Before you begin

Notes

A few notes about this script.

If you are running this with the 2022-2023 data make sure you download the whole (OSM_2022-2023 GitHub repository)[https://github.com/ACMElabUvic/OSM_2022-2023] from the ACMElabUvic GitHub. This will ensure you have all the files, data, and proper folder structure you will need to run this code and associated analyses.

Also make sure you open RStudio through the R project (OSM_2022-2023.Rproj) this will automatically set your working directory to the correct place (wherever you saved the repository) and ensure you don’t have to change the file paths for some of the data.

Lastly, if you are looking to adapt this code for a future year of data, you will want to ensure you have run the ACME_camera_script_9-2-2024.R or .Rmd with your data as there is much data formatting, cleaning, and restructuring that has to be done before this code will work.

If you have question please email the most recent author, currently

Marissa A. Dyck
Postdoctoral research fellow
University of Victoria
School of Environmental Studies
Email: marissadyck17@gmail.com

Install packages

If you don’t already have the following packages installed, use the code below to install them.

install.packages('tidyverse') 
install.packages('ggpubr')
install.packages('corrplot')
install.packages('Hmisc')
install.packages('glmmTMB')
install.packages('MuMIn')
install.packages('TMB', type = 'source')
install.packages('rphylopic')

Load libraries

Then load the packages to your library.

library(tidyverse) # data tidying, visualization, and much more; this will load all tidyverse packages, can see complete list using tidyverse_packages()
library(ggpubr) # make modificaions to plot for publication (arrange plots)
library(PerformanceAnalytics)    #Used to generate a correlation plot
library(Hmisc) # used to generate histograms for all variables in data frame
library(glmmTMB)      #Constructing GLMMs
## Warning in checkMatrixPackageVersion(): Package version inconsistency detected.
## TMB was built with Matrix version 1.4.1
## Current Matrix version is 1.5.3
## Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package
library(MuMIn) # for model selection
library(rphylopic) # add animal silhouettes to graphs

Data

Load detection data

Read in saved and cleaned detection data from the ACME_camera_script_9-2-2024.R.

# detection data
# read in saved and cleaned detection data from the ACME_camera_script_9-2-2024.R 
detections <- read_csv('data/processed/OSM_2022_ind_det.csv') %>% 
  
  # change site, species and event_id to factor
  mutate_if(is.character,
            as.factor)
## Rows: 14102 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (4): array, site, species, event_id
## dbl  (3): month, year, timediff
## dttm (1): datetime
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
str(detections)
## tibble [14,102 × 8] (S3: tbl_df/tbl/data.frame)
##  $ array   : Factor w/ 4 levels "LU01","LU13",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ site    : Factor w/ 155 levels "LU01_06","LU01_10",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ species : Factor w/ 39 levels "ATVer","Beaver",..: 31 31 38 38 38 38 38 38 38 38 ...
##  $ datetime: POSIXct[1:14102], format: "2022-06-17 10:01:52" "2023-09-10 12:51:15" ...
##  $ month   : num [1:14102] 6 9 6 7 7 7 8 8 8 8 ...
##  $ year    : num [1:14102] 2022 2023 2022 2022 2022 ...
##  $ timediff: num [1:14102] NA 648166 NA 31847 21429 ...
##  $ event_id: Factor w/ 14102 levels "E0","E1","E10",..: 1 2 5215 6326 7437 8548 9659 10770 11881 12992 ...

Data formatting

In order to get plots that have the same formatting as last years’ report we have to do a bit of data formatting. First we need to make sure we are including the same relevant species (some were ignored for last years’ report or grouped together)

Last years report had the following species

  • white-tailed deer
  • snowshoe hare
  • black bear
  • coyote
  • red squirrel
  • fisher
  • unknown
  • moose
  • lynx
  • spruce grouse
  • red fox
  • striped skunk
  • ruffed grouse
  • owl
  • grey wolf
  • domestic dog
  • cougar
  • raven
  • other
  • mule deer

And they grouped all humans except for staff as ‘Humans’. Let’s look at the species we have in this year’s data and try to format it the same way

detections %>% 
  
  # group by array and species
  group_by(species) %>% 
  summarise(n = n()) %>% 
  
  # have R print everything
  print(n = nrow(.))
## # A tibble: 39 × 2
##    species                 n
##    <fct>               <int>
##  1 ATVer                  32
##  2 Beaver                  3
##  3 Black bear           1419
##  4 Canada goose            4
##  5 Caribou                70
##  6 Cougar                  9
##  7 Coyote                990
##  8 Domestic dog            6
##  9 Fisher                187
## 10 Grey jay               50
## 11 Grey wolf             189
## 12 Human                   5
## 13 Hunter                  1
## 14 Long-tailed weasel     17
## 15 Lynx                  364
## 16 Marten                140
## 17 Moose                 617
## 18 Other                   2
## 19 Other birds           175
## 20 Otter                   7
## 21 Owl                    12
## 22 Porcupine               5
## 23 Raven                   6
## 24 Red fox               105
## 25 Red squirrel         1981
## 26 Ruffed grouse          43
## 27 Short-tailed weasel    21
## 28 Snowmobiler             7
## 29 Snowshoe hare        2911
## 30 Spruce grouse          77
## 31 Staff                 302
## 32 Striped skunk          40
## 33 Unknown               556
## 34 Unknown canid          70
## 35 Unknown deer          292
## 36 Unknown mustelid       55
## 37 Unknown ungulate       17
## 38 White-tailed deer    3307
## 39 Wolverine               8

Now let’s create a new data frame (tibble) to work with for the OSM figure summaries specifically

I personally would lump all the unknown together and all the birds together but for the sake of consistency with last year’s figures we will remove the same entries and keep the birds separate, let’s create a vector of entries to drop

species_drop <- c('Staff',
                  'Unknown deer',
                  'Unknown ungulate',
                  'Unknown canid',
                  'Unknown mustelid',
                  'Other birds')

# now we can create the new data frame with some changes consistent w/ choices made for 2021-2022
detections <- detections %>% 
  
  # for summarizing, lets lump all the recreational humans into "Humans"
  mutate(species = recode_factor(species, 
                                 "Snowmobiler" = "Human",
                                 "ATVer" = "Human",
                                 'Hunter' = 'Human')) %>% 
  
  # remove species we don't want to plot
  filter(!species %in% species_drop)

# look at data
str(detections)
## tibble [13,191 × 8] (S3: tbl_df/tbl/data.frame)
##  $ array   : Factor w/ 4 levels "LU01","LU13",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ site    : Factor w/ 155 levels "LU01_06","LU01_10",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ species : Factor w/ 36 levels "Human","Beaver",..: 35 35 35 35 35 35 35 35 35 35 ...
##  $ datetime: POSIXct[1:13191], format: "2022-06-18 11:09:19" "2022-07-10 13:56:10" ...
##  $ month   : num [1:13191] 6 7 7 7 8 8 8 8 10 10 ...
##  $ year    : num [1:13191] 2022 2022 2022 2022 2022 ...
##  $ timediff: num [1:13191] NA 31847 21429 8356 1627 ...
##  $ event_id: Factor w/ 14102 levels "E0","E1","E10",..: 5215 6326 7437 8548 9659 10770 11881 12992 3 1114 ...

We will also want to subset the data by landscape unit (LU) and generate a new data frame for each LU to use for plotting

# we will also want to create a data frame for each LU to plot individually

# LU1
dets_LU1 <- detections %>% 
  filter(array == 'LU01')

# LU13
dets_LU13 <- detections %>% 
  filter(array == 'LU13')

# LU15
dets_LU15 <- detections %>% 
  filter(array == 'LU15')

# LU21
dets_LU21 <- detections %>% 
  filter(array == 'LU21')

Marissa’s attempt at a for loop to subset data by array

We will also want to subset the data by landscape unit (LU) and generate a new data frame for each LU to use for plotting

Let’s see how this shit goes… probably bad but who knows

array_frames <- list()

for (i in unique(detections$array)){
  
   #Subset data based on radius
  df <- detections %>%
    filter(array == i)
  
 # list of dataframes
  array_frames <- c(array_frames, list(df))
  
 
}

# inspect one data frame
print(array_frames[[1]]) # this is for LU01
## # A tibble: 5,899 × 8
##    array site    species       datetime            month  year timediff event_id
##    <fct> <fct>   <fct>         <dttm>              <dbl> <dbl>    <dbl> <fct>   
##  1 LU01  LU01_06 White-tailed… 2022-06-18 11:09:19     6  2022      NA  E2      
##  2 LU01  LU01_06 White-tailed… 2022-07-10 13:56:10     7  2022   31847. E3      
##  3 LU01  LU01_06 White-tailed… 2022-07-25 11:04:44     7  2022   21429. E4      
##  4 LU01  LU01_06 White-tailed… 2022-07-31 06:38:06     7  2022    8356. E5      
##  5 LU01  LU01_06 White-tailed… 2022-08-01 09:45:28     8  2022    1627. E6      
##  6 LU01  LU01_06 White-tailed… 2022-08-01 15:51:01     8  2022     364. E7      
##  7 LU01  LU01_06 White-tailed… 2022-08-05 06:49:48     8  2022    5218. E8      
##  8 LU01  LU01_06 White-tailed… 2022-08-26 08:36:07     8  2022   30345. E9      
##  9 LU01  LU01_06 White-tailed… 2022-10-03 10:19:01    10  2022   54819. E10     
## 10 LU01  LU01_06 White-tailed… 2022-10-03 15:36:14    10  2022     317. E11     
## # ℹ 5,889 more rows

… I think this worked

Now let’s change names of list items using purrr, couldn’t figure out how to name them in the loop, you don’t necessarily need to do this because we change the names in the next section, but I like having things named

array_frames <- array_frames %>%

  purrr::set_names('LU01',
                   'LU13',
                   'LU15',
                   'LU21') # 

# inspect each data frame
print(array_frames$LU01)
## # A tibble: 5,899 × 8
##    array site    species       datetime            month  year timediff event_id
##    <fct> <fct>   <fct>         <dttm>              <dbl> <dbl>    <dbl> <fct>   
##  1 LU01  LU01_06 White-tailed… 2022-06-18 11:09:19     6  2022      NA  E2      
##  2 LU01  LU01_06 White-tailed… 2022-07-10 13:56:10     7  2022   31847. E3      
##  3 LU01  LU01_06 White-tailed… 2022-07-25 11:04:44     7  2022   21429. E4      
##  4 LU01  LU01_06 White-tailed… 2022-07-31 06:38:06     7  2022    8356. E5      
##  5 LU01  LU01_06 White-tailed… 2022-08-01 09:45:28     8  2022    1627. E6      
##  6 LU01  LU01_06 White-tailed… 2022-08-01 15:51:01     8  2022     364. E7      
##  7 LU01  LU01_06 White-tailed… 2022-08-05 06:49:48     8  2022    5218. E8      
##  8 LU01  LU01_06 White-tailed… 2022-08-26 08:36:07     8  2022   30345. E9      
##  9 LU01  LU01_06 White-tailed… 2022-10-03 10:19:01    10  2022   54819. E10     
## 10 LU01  LU01_06 White-tailed… 2022-10-03 15:36:14    10  2022     317. E11     
## # ℹ 5,889 more rows

Detection plots

Detection data

Now we can apply the same data formatting for each LUs’ data frame using purrr.

We want to count the number of independent detections per species per LU to use in the detection plots

# apply the same formatting to each LU data frame using purrr map
detection_data <- array_frames %>% 
  
  purrr::map(
    ~.x %>% 
      
      # group by species
      group_by(species) %>%
      
      # calculate a column with unique accounts of each species
      mutate(count = n_distinct(event_id)) %>% 
      
      # keep just the columns we need
      select(species, count) %>% 
      
      # keep only unique (distinct) rows so we should be left with one row per species, this helps with plotting later if you don't do it ggplot will try to count and plot each row it's annoying
      distinct()) %>% 
  
  # set names of list objects
  purrr::set_names('Detections LU01',
                   'Detections LU13',
                   'Detections LU15',
                   'Detections LU21')

Detection plots

Now to graph independent detections for each LU using purrr, this avoids a TON of code repetition needed to plot each one individually

We use purrr::imap() instead of purrr::map() because imap maintains the variable names in our list (e.g. Detections LU01, Detections LU13, etc.) which we can then use to title each plot.

Within purrr::imap() we just paste the code we would use for a single ggplot since all the graphical elements (except the title which we change with the file name [.y]) are the same

# create object detection plots which uses the detection_data list (w/ all 4 LUs)
detection_plots <- detection_data %>% 
  
  # use imap instead of map as it allows us to use .y to paste the list element names as the plot titles later
  purrr::imap(
    ~.x %>% 
      
      # now just copy and paste the ggplot code for the detection graphs
      ggplot(.,
             aes(x = reorder(species, count), y = count)) +
      
      # plot as bar graph using geom_col so we don't have to provide a y aesthetic
      geom_col() +
      
      # switch the x and y axis
      coord_flip() +
      
      # add the number of detections at the end of each bar
      geom_text(aes(label = count),
                color = "black",
                size = 3,
                hjust = -0.3,
                vjust = 0.2) +
      
      # label x and y axis with informative titles
      labs(x = 'Species',
           y = 'Number of Independent (30 min) Detections') +
      
      # add title to plot with LU name the .y will take the name of whatever you named each list element in the detection_data list, so make sure this name is what you want on the ggtitle
      ggtitle(.y) +
      
      # set the theme
      theme_classic() +
      theme(plot.title = element_text(hjust = 0.5)))

# view plots, this will print each in it's own window so you have to scroll back in the plot viewer pane to look at each one
detection_plots
## $`Detections LU01`

## 
## $`Detections LU13`

## 
## $`Detections LU15`

## 
## $`Detections LU21`

Save detection plots

Now we want to save these plots in case we need each individual one (we will combine the detection and naive occ plots into a single figure for each LU later and use those for the OSM report, but we may want these standalone plots later so let’s save them while they are here).

We can save all the plots from the purrr iteration above using purrr::imap. imap is used instead of map because it allows us to retain the list object names (plot names) to paste as the file name with the .y command.

IMPORTANT if you are using this code for a future github repo, DO NOT use .tiff as the file extension. This will cause issues when trying to push any changes to the github repo as the files are too large to meet githubs requirements

# save plots only use if needed
purrr::imap(
  detection_plots,
  ~ggsave(.x,
             file = paste0("figures/",
                           .y,
                           '.jpg'), # avoid using .tiff extension in the github repo, those files are too large to push to origin
          dpi = 600,
          width = 11,
          height = 9,
          units = 'in'))

Naive occupancy

Data

We also need to alter the detection data a bit to use for naive occupancy plots.

We will use the individual LU detection data like we did before and use purrr::map() to apply the dame data formatting to all 4 data frames.

Here we want to calculate the total number of sites in each LU, the number of sites each species was detected at in each LU and then use both those numbers to calculate naive occupancy for each species in each LU

# First we need to alter the data frame a bit for these plots, let's create a data frame for each LU (I couldn't figure out how to do this without assigning individual data frames for each UGH)


# apply the same formatting to each data frame using purrr
occupancy_data <- array_frames %>% 
  
  purrr::map(
    ~.x %>% 
      
      # calculate the total number of sites for each LU
      mutate(total_sites = n_distinct(site)) %>% 
      
      # group by species to calculate the number of sites each spp occurred at
      group_by(species) %>% 
  
      # add columns to count the number of sites each spp occurred at and then the naive occupancy
  reframe(count = n_distinct(site),
          naive_occ = count/total_sites,
          ind_det = n_distinct(event_id)) %>% 
  
    # keep just the columns we need
  select(species, naive_occ, ind_det) %>% 
  
    # keep only unique (distinct) rows so we should be left with one row per species, this helps with plotting
  distinct()) %>% 
  
  purrr::set_names('Naive Occupancy LU01',
                   'Naive Occupancy LU13',
                   'Naive Occupancy LU15',
                   'Naive Occupancy LU21')

Occupancy plots

Now we can graph naive occupancy for each LU using purrr, and as with the detection plots this saves a massive amount of coding using purrr to run an iteration on the data files and produce four plots at once instead of copying and pasting code for each individually

# create object occupancy_plots which uses the occupancy_data list (w/ all 4 LUs)
occupancy_plots <- occupancy_data %>% 
  
  # use imap instead of map as it allows us to use .y to paste the list element names as the plot titles later
  purrr::imap(
    ~.x %>% 

      # now just copy and paste the ggplot code for the occupancy graphs
      ggplot(.,
             aes(x = fct_reorder(species,
                                 ind_det), # this reorders the species so they match the order of the detection plot which makes it better for viewing when the plots are arranged together in 1 figure for each LU
                 y = naive_occ)) +
      
      # plot as bars using geom_col() which uses stat = 'identity', instead of geom_bar() which will count the rows in each group and plot that instead of naive occ
      geom_col() +
      
      # flip x and y axis 
      coord_flip() +
      
      # add text to end of bars that provides naive occ value
      geom_text(aes(label = round(naive_occ, 2)), 
                size = 3, 
                hjust = -0.3, 
                vjust = 0.2) +
      
      # relabel x and y axis and title
      labs(x = 'Species',
           y = 'Proportion of Sites With At Least One Detection') +
      
      # set plot title using .y (name of list object)
      ggtitle(.y) +
      
      # set. theme elements
      theme_classic()+
      theme(plot.title = element_text(hjust = 0.5)))

# view plots
occupancy_plots
## $`Naive Occupancy LU01`

## 
## $`Naive Occupancy LU13`

## 
## $`Naive Occupancy LU15`

## 
## $`Naive Occupancy LU21`

Save occupancy plots

As with the detection plots, we might want these individual plots later for something so we can use purrr::imap() to save them to the figures folder

Again avoid using the .tiff extension in github

# save plots 
purrr::imap(
  occupancy_plots,
  ~ggsave(.x,
          file = paste0("figures/",
                        .y,
                        '.jpg'), # avoid using .tiff extension in the github repo, those files are too large to push to origin
          dpi = 600,
          width = 11,
          height = 9,
          units = 'in'))

Final combined plots for report

The previous year’s report had a figure for each LU with the detections plot on the top and the occupancy plot on the bottom so we will recreate these for this year using ggarrange().

Unfortunately I could not figure out how to do this in purrr to reduce coding but luckily it isn’t too much repitition

# not sure I know how to do the following section in purrr just yet, but we've saved a ton of coding so far and it doesn't take much to arrange each of these individually

# LU1

# arrange the plots so each LU has a figure with detections on top and naive occ on bottom
LU1_det_occ_plots <- ggarrange(detection_plots$`Detections LU01`, occupancy_plots$`Naive Occupancy LU01`,
                               labels = c("A", "B"),
                               nrow = 2)

# view plot
LU1_det_occ_plots

# LU13

# arrange the plots so each LU has a figure with detections on top and naive occ on bottom
LU13_det_occ_plots <- ggarrange(detection_plots$`Detections LU13`, occupancy_plots$`Naive Occupancy LU13`,
                               labels = c("A", "B"),
                               nrow = 2)

# view plot
LU13_det_occ_plots

# LU15

# arrange the plots so each LU has a figure with detections on top and naive occ on bottom
LU15_det_occ_plots <- ggarrange(detection_plots$`Detections LU15`, occupancy_plots$`Naive Occupancy LU15`,
                                labels = c("A", "B"),
                                nrow = 2)

# view plot
LU15_det_occ_plots

# LU21

# arrange the plots so each LU has a figure with detections on top and naive occ on bottom
LU21_det_occ_plots <- ggarrange(detection_plots$`Detections LU21`, occupancy_plots$`Naive Occupancy LU21`,
                                labels = c("A", "B"),
                                nrow = 2)

# view plot
LU21_det_occ_plots

We can however, save all the figures again using purrr

# save all figures at once using purrr
final_det_occ_plots <- list(LU1_det_occ_plots,
                            LU13_det_occ_plots,
                            LU15_det_occ_plots,
                            LU21_det_occ_plots) %>% 
  

  purrr::set_names('LU01_det_occ_plots',
                   'LU13_det_occ_plots',
                   'LU15_det_occ_plots',
                   'LU21_det_occ_plots') %>% 
  
  purrr::imap(
    ~ggsave(.x,
            file = paste0("figures/",
                          .y,
                          '.jpg'), # avoid using .tiff extension in the github repo, those files are too large to push to origin
            dpi = 600,
            width = 12,
            height = 15,
            units = 'in'))

Analysis prep

Read in data

We need the proportional binomial data and the covariate data (from the ACME_camera_script_9-2-2024.R or .Rmd), let’s read those in now and check the structure of each

# response metric (proportional detections from the from the ACME_camera_script_9-2-2024.R or .Rmd)
prop_detections <- read_csv('data/processed/OSM_2022_proportional_detections.csv')
## Rows: 154 Columns: 25
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): site
## dbl (24): black_bear, coyote, fisher, moose, snowshoe_hare, white-tailed_dee...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# check variable structure
str(prop_detections)
## spc_tbl_ [154 × 25] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ site                    : chr [1:154] "LU01_06" "LU01_10" "LU01_11" "LU01_13" ...
##  $ black_bear              : num [1:154] 7 3 4 7 8 9 4 5 7 7 ...
##  $ coyote                  : num [1:154] 4 4 8 10 11 9 11 0 9 4 ...
##  $ fisher                  : num [1:154] 5 3 3 3 2 1 1 1 0 3 ...
##  $ moose                   : num [1:154] 3 2 5 9 1 0 2 4 1 0 ...
##  $ snowshoe_hare           : num [1:154] 4 1 3 0 8 2 2 0 12 4 ...
##  $ white-tailed_deer       : num [1:154] 12 5 12 12 13 14 15 9 12 10 ...
##  $ cougar                  : num [1:154] 0 0 1 0 1 0 0 0 0 0 ...
##  $ grey_wolf               : num [1:154] 0 0 2 0 0 0 1 0 0 0 ...
##  $ lynx                    : num [1:154] 0 0 1 0 1 1 0 0 0 2 ...
##  $ red_fox                 : num [1:154] 0 0 2 0 0 0 0 0 4 0 ...
##  $ wolverine               : num [1:154] 0 0 0 0 0 0 0 0 0 0 ...
##  $ caribou                 : num [1:154] 0 0 0 0 0 0 0 0 0 0 ...
##  $ absent_black_bear       : num [1:154] 5 3 8 5 4 3 8 7 5 5 ...
##  $ absent_coyote           : num [1:154] 10 1 6 5 3 5 4 15 6 11 ...
##  $ absent_fisher           : num [1:154] 9 2 11 12 12 13 14 14 15 12 ...
##  $ absent_moose            : num [1:154] 11 3 9 6 13 14 13 11 14 15 ...
##  $ absent_snowshoe_hare    : num [1:154] 10 4 11 15 6 12 13 15 3 11 ...
##  $ absent_white-tailed_deer: num [1:154] 2 0 2 3 1 0 0 6 3 5 ...
##  $ absent_cougar           : num [1:154] 14 5 13 15 13 14 15 15 15 15 ...
##  $ absent_grey_wolf        : num [1:154] 14 5 12 15 14 14 14 15 15 15 ...
##  $ absent_lynx             : num [1:154] 14 5 13 15 13 13 15 15 15 13 ...
##  $ absent_red_fox          : num [1:154] 14 5 12 15 14 14 15 15 11 15 ...
##  $ absent_wolverine        : num [1:154] 14 5 14 15 14 14 15 15 15 15 ...
##  $ absent_caribou          : num [1:154] 14 5 14 15 14 14 15 15 15 15 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   site = col_character(),
##   ..   black_bear = col_double(),
##   ..   coyote = col_double(),
##   ..   fisher = col_double(),
##   ..   moose = col_double(),
##   ..   snowshoe_hare = col_double(),
##   ..   `white-tailed_deer` = col_double(),
##   ..   cougar = col_double(),
##   ..   grey_wolf = col_double(),
##   ..   lynx = col_double(),
##   ..   red_fox = col_double(),
##   ..   wolverine = col_double(),
##   ..   caribou = col_double(),
##   ..   absent_black_bear = col_double(),
##   ..   absent_coyote = col_double(),
##   ..   absent_fisher = col_double(),
##   ..   absent_moose = col_double(),
##   ..   absent_snowshoe_hare = col_double(),
##   ..   `absent_white-tailed_deer` = col_double(),
##   ..   absent_cougar = col_double(),
##   ..   absent_grey_wolf = col_double(),
##   ..   absent_lynx = col_double(),
##   ..   absent_red_fox = col_double(),
##   ..   absent_wolverine = col_double(),
##   ..   absent_caribou = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
# model covariates (merged HFI and VEG data from the ACME_camera_script_9-2-2024.R or .Rmd)
covariates <- read_csv('data/processed/OSM_2022_covariates.csv',
                       
                       # set the column types to read in correctly
                       col_types = cols(array = col_factor(),
                                        camera = col_factor(),
                                        site = col_factor(),
                                        buff_dist = col_factor(),
                                        .default = col_number()))

# check variable structure
str(covariates)
## spc_tbl_ [3,100 × 119] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ array                       : Factor w/ 4 levels "LU13","LU15",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ camera                      : Factor w/ 96 levels "18","15","03",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ site                        : Factor w/ 155 levels "LU13_18","LU13_15",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ buff_dist                   : Factor w/ 20 levels "250","500","750",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ vegetated_edge_roads        : num [1:3100] 0 0.0858 0 0 0 ...
##  $ harvest_area                : num [1:3100] 0 0 0.687 0.337 0 ...
##  $ road_gravel_1l              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ conventional_seismic        : num [1:3100] 0 0.03276 0 0.00889 0.01145 ...
##  $ tame_pasture                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ pipeline                    : num [1:3100] 0 0.068 0 0 0.0301 ...
##  $ road_gravel_2l              : num [1:3100] 0 0 0 0 0 ...
##  $ trail                       : num [1:3100] 0.00588 0.0028 0 0.00196 0 ...
##  $ well_bitumen                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rough_pasture               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_aband                  : num [1:3100] 0 0 0 0 0.0322 ...
##  $ road_unclassified           : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ crop                        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ low_impact_seismic          : num [1:3100] 0 0 0 0 0.0523 ...
##  $ clearing_unknown            : num [1:3100] 0.0923 0.0697 0 0 0 ...
##  $ cultivation_abandoned       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_undiv_2l         : num [1:3100] 0 0.0174 0 0 0 ...
##  $ road_unimproved             : num [1:3100] 0 0 0 0 0 ...
##  $ truck_trail                 : num [1:3100] 0 0 0 0.0139 0 ...
##  $ dugout                      : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_undiv_1l         : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_gas                    : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ vegetated_edge_railways     : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ harvest_area_white_zone     : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ country_residence           : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ borrowpit_dry               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rural_residence             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ borrowpit_wet               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ borrowpits                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ grvl_sand_pit               : num [1:3100] 0 0.0873 0 0 0 ...
##  $ ris_reclaimed_temp          : num [1:3100] 0 0.0477 0 0 0 ...
##  $ ris_clearing_unknown        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_drainage                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_mines_oilsands          : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_overburden_dump         : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_facility_operations     : num [1:3100] 0 0 0 0 0 ...
##  $ transmission_line           : num [1:3100] 0.0642 0 0 0 0.091 ...
##  $ ris_tailing_pond            : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ clearing_wellpad_unconfirmed: num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ mines_oilsands              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_soil_replaced           : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_1l               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_oilsands_rms            : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_facility_unknown        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_borrowpits              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_transmission_line       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_soil_salvaged           : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_road                    : num [1:3100] 0 0 0 0 0 ...
##  $ ris_plant                   : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ urban_residence             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ facility_other              : num [1:3100] 0 0 0 0 0 ...
##  $ airp_runway                 : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ runway                      : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_reclaimed_permanent     : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ urban_industrial            : num [1:3100] 0.291 0 0 0 0 ...
##  $ lagoon                      : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ facility_unknown            : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ residence_clearing          : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_cased                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_unpaved_2l             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_3l               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ surrounding_veg             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rlwy_sgl_track              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_winter                 : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ sump                        : num [1:3100] 0 0 0 0 0 ...
##  $ greenspace                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_2l               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_other                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ canal                       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ reservoir                   : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_cleared_not_confirmed  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ misc_oil_gas_facility       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ camp_industrial             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_camp_industrial         : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ oil_gas_plant               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_unknown                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_utilities               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ cfo                         : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ recreation                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ campground                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ peat                        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ golfcourse                  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ landfill                    : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ transfer_station            : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ mill                        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_div              : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rlwy_spur                   : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_cleared_not_drilled    : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ open_pit_mine               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ well_oil                    : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ road_paved_4l               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ mines_pitlake               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_reclaimed_certified     : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ ris_windrow                 : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ tailing_pond                : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##   [list output truncated]
##  - attr(*, "spec")=
##   .. cols(
##   ..   .default = col_number(),
##   ..   array = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   camera = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   site = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   buff_dist = col_factor(levels = NULL, ordered = FALSE, include_na = FALSE),
##   ..   vegetated_edge_roads = col_number(),
##   ..   harvest_area = col_number(),
##   ..   road_gravel_1l = col_number(),
##   ..   conventional_seismic = col_number(),
##   ..   tame_pasture = col_number(),
##   ..   pipeline = col_number(),
##   ..   road_gravel_2l = col_number(),
##   ..   trail = col_number(),
##   ..   well_bitumen = col_number(),
##   ..   rough_pasture = col_number(),
##   ..   well_aband = col_number(),
##   ..   road_unclassified = col_number(),
##   ..   crop = col_number(),
##   ..   low_impact_seismic = col_number(),
##   ..   clearing_unknown = col_number(),
##   ..   cultivation_abandoned = col_number(),
##   ..   road_paved_undiv_2l = col_number(),
##   ..   road_unimproved = col_number(),
##   ..   truck_trail = col_number(),
##   ..   dugout = col_number(),
##   ..   road_paved_undiv_1l = col_number(),
##   ..   well_gas = col_number(),
##   ..   vegetated_edge_railways = col_number(),
##   ..   harvest_area_white_zone = col_number(),
##   ..   country_residence = col_number(),
##   ..   borrowpit_dry = col_number(),
##   ..   rural_residence = col_number(),
##   ..   borrowpit_wet = col_number(),
##   ..   borrowpits = col_number(),
##   ..   grvl_sand_pit = col_number(),
##   ..   ris_reclaimed_temp = col_number(),
##   ..   ris_clearing_unknown = col_number(),
##   ..   ris_drainage = col_number(),
##   ..   ris_mines_oilsands = col_number(),
##   ..   ris_overburden_dump = col_number(),
##   ..   ris_facility_operations = col_number(),
##   ..   transmission_line = col_number(),
##   ..   ris_tailing_pond = col_number(),
##   ..   clearing_wellpad_unconfirmed = col_number(),
##   ..   mines_oilsands = col_number(),
##   ..   ris_soil_replaced = col_number(),
##   ..   road_paved_1l = col_number(),
##   ..   ris_oilsands_rms = col_number(),
##   ..   ris_facility_unknown = col_number(),
##   ..   ris_borrowpits = col_number(),
##   ..   ris_transmission_line = col_number(),
##   ..   ris_soil_salvaged = col_number(),
##   ..   ris_road = col_number(),
##   ..   ris_plant = col_number(),
##   ..   urban_residence = col_number(),
##   ..   facility_other = col_number(),
##   ..   airp_runway = col_number(),
##   ..   runway = col_number(),
##   ..   ris_reclaimed_permanent = col_number(),
##   ..   urban_industrial = col_number(),
##   ..   lagoon = col_number(),
##   ..   facility_unknown = col_number(),
##   ..   residence_clearing = col_number(),
##   ..   well_cased = col_number(),
##   ..   road_unpaved_2l = col_number(),
##   ..   road_paved_3l = col_number(),
##   ..   surrounding_veg = col_number(),
##   ..   rlwy_sgl_track = col_number(),
##   ..   road_winter = col_number(),
##   ..   sump = col_number(),
##   ..   greenspace = col_number(),
##   ..   road_paved_2l = col_number(),
##   ..   well_other = col_number(),
##   ..   canal = col_number(),
##   ..   reservoir = col_number(),
##   ..   well_cleared_not_confirmed = col_number(),
##   ..   misc_oil_gas_facility = col_number(),
##   ..   camp_industrial = col_number(),
##   ..   ris_camp_industrial = col_number(),
##   ..   oil_gas_plant = col_number(),
##   ..   well_unknown = col_number(),
##   ..   ris_utilities = col_number(),
##   ..   cfo = col_number(),
##   ..   recreation = col_number(),
##   ..   campground = col_number(),
##   ..   peat = col_number(),
##   ..   golfcourse = col_number(),
##   ..   landfill = col_number(),
##   ..   transfer_station = col_number(),
##   ..   mill = col_number(),
##   ..   road_paved_div = col_number(),
##   ..   rlwy_spur = col_number(),
##   ..   well_cleared_not_drilled = col_number(),
##   ..   open_pit_mine = col_number(),
##   ..   well_oil = col_number(),
##   ..   road_paved_4l = col_number(),
##   ..   mines_pitlake = col_number(),
##   ..   ris_reclaimed_certified = col_number(),
##   ..   ris_windrow = col_number(),
##   ..   tailing_pond = col_number(),
##   ..   rlwy_mlt_track = col_number(),
##   ..   rlwy_dbl_track = col_number(),
##   ..   ris_waste = col_number(),
##   ..   interchange_ramp = col_number(),
##   ..   road_paved_5l = col_number(),
##   ..   ris_airp_runway = col_number(),
##   ..   fruit_vegetables = col_number(),
##   ..   road_unpaved_1l = col_number(),
##   ..   ris_reclaim_ready = col_number(),
##   ..   ris_tank_farm = col_number(),
##   ..   lc_class20 = col_number(),
##   ..   lc_class32 = col_number(),
##   ..   lc_class33 = col_number(),
##   ..   lc_class34 = col_number(),
##   ..   lc_class50 = col_number(),
##   ..   lc_class110 = col_number(),
##   ..   lc_class120 = col_number(),
##   ..   lc_class210 = col_number(),
##   ..   lc_class220 = col_number(),
##   ..   lc_class230 = col_number()
##   .. )
##  - attr(*, "problems")=<externalptr>

Format covariates

There are too many covariates to include in the models individually and many of them describe similar HFI features. We can use the info from the README file in this repository which includes detailed descriptions from the ABMI human footprints wall to wall data download website for Year 2021 OR in the relevant_literature folder of this repository (HFI_2021_v1_0_Metadata_Final.pdf).

the current version of this code for the purposes of the 2022-2023 report used a merged dataset from 2021-2022 and 2022-2023 data, howver each year of data the variables were extracted slightly differenty from GIS so final version of this code will include a different formatting process which will likely occur in the ACME_camera_script_9-2-2024.R or .Rmd’

First lets order the columns alphabetically so we can look at descriptions for everything in the ABMI doc easier. We will want the non-covariate columns (i.e., array, site, camera, buffer_dsit) at the front so we can use relocate after we order all of the columns to move these four to the front of the data.

covariates <- covariates %>% 
  
  # order columns alphabetically
  select(order(colnames(.))) %>% 
  
  # we want to move the columns that aren't HFI features or landcover to the front
  relocate(.,
           c(array,
             site,
             camera,
             buff_dist))  

# get a list of column names to ensure it worked
names(covariates)
##   [1] "array"                        "site"                        
##   [3] "camera"                       "buff_dist"                   
##   [5] "airp_runway"                  "borrowpit_dry"               
##   [7] "borrowpit_wet"                "borrowpits"                  
##   [9] "camp_industrial"              "campground"                  
##  [11] "canal"                        "cfo"                         
##  [13] "clearing_unknown"             "clearing_wellpad_unconfirmed"
##  [15] "conventional_seismic"         "country_residence"           
##  [17] "crop"                         "cultivation_abandoned"       
##  [19] "dugout"                       "facility_other"              
##  [21] "facility_unknown"             "fruit_vegetables"            
##  [23] "golfcourse"                   "greenspace"                  
##  [25] "grvl_sand_pit"                "harvest_area"                
##  [27] "harvest_area_white_zone"      "interchange_ramp"            
##  [29] "lagoon"                       "landfill"                    
##  [31] "lc_class110"                  "lc_class120"                 
##  [33] "lc_class20"                   "lc_class210"                 
##  [35] "lc_class220"                  "lc_class230"                 
##  [37] "lc_class32"                   "lc_class33"                  
##  [39] "lc_class34"                   "lc_class50"                  
##  [41] "low_impact_seismic"           "mill"                        
##  [43] "mines_oilsands"               "mines_pitlake"               
##  [45] "misc_oil_gas_facility"        "oil_gas_plant"               
##  [47] "open_pit_mine"                "peat"                        
##  [49] "pipeline"                     "recreation"                  
##  [51] "reservoir"                    "residence_clearing"          
##  [53] "ris_airp_runway"              "ris_borrowpits"              
##  [55] "ris_camp_industrial"          "ris_clearing_unknown"        
##  [57] "ris_drainage"                 "ris_facility_operations"     
##  [59] "ris_facility_unknown"         "ris_mines_oilsands"          
##  [61] "ris_oilsands_rms"             "ris_overburden_dump"         
##  [63] "ris_plant"                    "ris_reclaim_ready"           
##  [65] "ris_reclaimed_certified"      "ris_reclaimed_permanent"     
##  [67] "ris_reclaimed_temp"           "ris_road"                    
##  [69] "ris_soil_replaced"            "ris_soil_salvaged"           
##  [71] "ris_tailing_pond"             "ris_tank_farm"               
##  [73] "ris_transmission_line"        "ris_utilities"               
##  [75] "ris_waste"                    "ris_windrow"                 
##  [77] "rlwy_dbl_track"               "rlwy_mlt_track"              
##  [79] "rlwy_sgl_track"               "rlwy_spur"                   
##  [81] "road_gravel_1l"               "road_gravel_2l"              
##  [83] "road_paved_1l"                "road_paved_2l"               
##  [85] "road_paved_3l"                "road_paved_4l"               
##  [87] "road_paved_5l"                "road_paved_div"              
##  [89] "road_paved_undiv_1l"          "road_paved_undiv_2l"         
##  [91] "road_unclassified"            "road_unimproved"             
##  [93] "road_unpaved_1l"              "road_unpaved_2l"             
##  [95] "road_winter"                  "rough_pasture"               
##  [97] "runway"                       "rural_residence"             
##  [99] "sump"                         "surrounding_veg"             
## [101] "tailing_pond"                 "tame_pasture"                
## [103] "trail"                        "transfer_station"            
## [105] "transmission_line"            "truck_trail"                 
## [107] "urban_industrial"             "urban_residence"             
## [109] "vegetated_edge_railways"      "vegetated_edge_roads"        
## [111] "well_aband"                   "well_bitumen"                
## [113] "well_cased"                   "well_cleared_not_confirmed"  
## [115] "well_cleared_not_drilled"     "well_gas"                    
## [117] "well_oil"                     "well_other"                  
## [119] "well_unknown"

Let’s get a summary of each variable now, and lets filter by just the 1000m buffer width so we don’t have a bunch of repeated data for each buffer width at each site, this will give us general insights into how much variability we have with each feature at a general buffer width. *You can change this if you are interested in a different bufffer width specifically, or if it makes more since to see the data for the min (250m) or max (5000m) buffer width.

covariates %>% 
  # filter to just buffer 1000 m
  filter(buff_dist == 1000) %>% 
  
  summary(.)
##   array         site         camera      buff_dist    airp_runway
##  LU13:41   LU13_18:  1   27     :  4   1000   :155   Min.   :0   
##  LU15:39   LU13_15:  1   32     :  4   250    :  0   1st Qu.:0   
##  LU21:36   LU13_03:  1   41     :  4   500    :  0   Median :0   
##  LU01:39   LU13_34:  1   36     :  4   750    :  0   Mean   :0   
##            LU13_57:  1   16     :  3   1250   :  0   3rd Qu.:0   
##            LU13_16:  1   21     :  3   1500   :  0   Max.   :0   
##            (Other):149   (Other):133   (Other):  0               
##  borrowpit_dry       borrowpit_wet         borrowpits       
##  Min.   :0.0000000   Min.   :0.0000000   Min.   :0.0000000  
##  1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.0000000  
##  Median :0.0000000   Median :0.0000000   Median :0.0000000  
##  Mean   :0.0009388   Mean   :0.0006446   Mean   :0.0002542  
##  3rd Qu.:0.0000000   3rd Qu.:0.0000000   3rd Qu.:0.0000000  
##  Max.   :0.0300351   Max.   :0.0198622   Max.   :0.0072821  
##                                                             
##  camp_industrial       campground     canal        cfo    clearing_unknown  
##  Min.   :0.0000000   Min.   :0    Min.   :0   Min.   :0   Min.   :0.000000  
##  1st Qu.:0.0000000   1st Qu.:0    1st Qu.:0   1st Qu.:0   1st Qu.:0.000000  
##  Median :0.0000000   Median :0    Median :0   Median :0   Median :0.000000  
##  Mean   :0.0003785   Mean   :0    Mean   :0   Mean   :0   Mean   :0.006422  
##  3rd Qu.:0.0000000   3rd Qu.:0    3rd Qu.:0   3rd Qu.:0   3rd Qu.:0.001654  
##  Max.   :0.0160772   Max.   :0    Max.   :0   Max.   :0   Max.   :0.182912  
##                                                                             
##  clearing_wellpad_unconfirmed conventional_seismic country_residence  
##  Min.   :0.0000000            Min.   :0.000000     Min.   :0.0000000  
##  1st Qu.:0.0000000            1st Qu.:0.002498     1st Qu.:0.0000000  
##  Median :0.0000000            Median :0.005202     Median :0.0000000  
##  Mean   :0.0004428            Mean   :0.006143     Mean   :0.0000828  
##  3rd Qu.:0.0000000            3rd Qu.:0.009577     3rd Qu.:0.0000000  
##  Max.   :0.0117571            Max.   :0.020381     Max.   :0.0128340  
##                                                                       
##       crop   cultivation_abandoned     dugout  facility_other    
##  Min.   :0   Min.   :0.000e+00     Min.   :0   Min.   :0.000000  
##  1st Qu.:0   1st Qu.:0.000e+00     1st Qu.:0   1st Qu.:0.000000  
##  Median :0   Median :0.000e+00     Median :0   Median :0.000000  
##  Mean   :0   Mean   :5.408e-05     Mean   :0   Mean   :0.001119  
##  3rd Qu.:0   3rd Qu.:0.000e+00     3rd Qu.:0   3rd Qu.:0.000000  
##  Max.   :0   Max.   :8.383e-03     Max.   :0   Max.   :0.062266  
##                                                                  
##  facility_unknown    fruit_vegetables   golfcourse   greenspace
##  Min.   :0.000e+00   Min.   :0        Min.   :0    Min.   :0   
##  1st Qu.:0.000e+00   1st Qu.:0        1st Qu.:0    1st Qu.:0   
##  Median :0.000e+00   Median :0        Median :0    Median :0   
##  Mean   :5.746e-05   Mean   :0        Mean   :0    Mean   :0   
##  3rd Qu.:0.000e+00   3rd Qu.:0        3rd Qu.:0    3rd Qu.:0   
##  Max.   :3.281e-03   Max.   :0        Max.   :0    Max.   :0   
##                                                                
##  grvl_sand_pit       harvest_area     harvest_area_white_zone interchange_ramp
##  Min.   :0.000000   Min.   :0.00000   Min.   :0               Min.   :0       
##  1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0               1st Qu.:0       
##  Median :0.000000   Median :0.00000   Median :0               Median :0       
##  Mean   :0.003109   Mean   :0.02293   Mean   :0               Mean   :0       
##  3rd Qu.:0.000000   3rd Qu.:0.00000   3rd Qu.:0               3rd Qu.:0       
##  Max.   :0.109732   Max.   :0.42899   Max.   :0               Max.   :0       
##                                                                               
##      lagoon             landfill  lc_class110        lc_class120       
##  Min.   :0.0000000   Min.   :0   Min.   :0.000000   Min.   :0.000e+00  
##  1st Qu.:0.0000000   1st Qu.:0   1st Qu.:0.004449   1st Qu.:0.000e+00  
##  Median :0.0000000   Median :0   Median :0.046414   Median :0.000e+00  
##  Mean   :0.0002406   Mean   :0   Mean   :0.054946   Mean   :3.878e-06  
##  3rd Qu.:0.0000000   3rd Qu.:0   3rd Qu.:0.082004   3rd Qu.:0.000e+00  
##  Max.   :0.0126573   Max.   :0   Max.   :0.231159   Max.   :6.011e-04  
##                                                                        
##    lc_class20       lc_class210      lc_class220       lc_class230     
##  Min.   :0.00000   Min.   :0.0000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0.00000   1st Qu.:0.4659   1st Qu.:0.00000   1st Qu.:0.00000  
##  Median :0.00000   Median :0.7228   Median :0.01425   Median :0.02315  
##  Mean   :0.02123   Mean   :0.6400   Mean   :0.10735   Mean   :0.06363  
##  3rd Qu.:0.00000   3rd Qu.:0.8433   3rd Qu.:0.16066   3rd Qu.:0.08982  
##  Max.   :0.38025   Max.   :0.9858   Max.   :0.84274   Max.   :0.47473  
##                                                                        
##    lc_class32   lc_class33         lc_class34        lc_class50     
##  Min.   :0    Min.   :0.000000   Min.   :0.00000   Min.   :0.00000  
##  1st Qu.:0    1st Qu.:0.000000   1st Qu.:0.00000   1st Qu.:0.01205  
##  Median :0    Median :0.000000   Median :0.00000   Median :0.03874  
##  Mean   :0    Mean   :0.004366   Mean   :0.03862   Mean   :0.06991  
##  3rd Qu.:0    3rd Qu.:0.000000   3rd Qu.:0.05870   3rd Qu.:0.09848  
##  Max.   :0    Max.   :0.243332   Max.   :0.25234   Max.   :0.55986  
##                                                                     
##  low_impact_seismic      mill   mines_oilsands mines_pitlake
##  Min.   :0.000000   Min.   :0   Min.   :0      Min.   :0    
##  1st Qu.:0.000000   1st Qu.:0   1st Qu.:0      1st Qu.:0    
##  Median :0.000000   Median :0   Median :0      Median :0    
##  Mean   :0.004172   Mean   :0   Mean   :0      Mean   :0    
##  3rd Qu.:0.000063   3rd Qu.:0   3rd Qu.:0      3rd Qu.:0    
##  Max.   :0.060391   Max.   :0   Max.   :0      Max.   :0    
##                                                             
##  misc_oil_gas_facility oil_gas_plant      open_pit_mine            peat  
##  Min.   :0.000000      Min.   :0.000000   Min.   :0.0000000   Min.   :0  
##  1st Qu.:0.000000      1st Qu.:0.000000   1st Qu.:0.0000000   1st Qu.:0  
##  Median :0.000000      Median :0.000000   Median :0.0000000   Median :0  
##  Mean   :0.002912      Mean   :0.001167   Mean   :0.0005665   Mean   :0  
##  3rd Qu.:0.000000      3rd Qu.:0.000000   3rd Qu.:0.0000000   3rd Qu.:0  
##  Max.   :0.107208      Max.   :0.071271   Max.   :0.0389603   Max.   :0  
##                                                                          
##     pipeline         recreation   reservoir         residence_clearing
##  Min.   :0.00000   Min.   :0    Min.   :0.000e+00   Min.   :0         
##  1st Qu.:0.00000   1st Qu.:0    1st Qu.:0.000e+00   1st Qu.:0         
##  Median :0.02243   Median :0    Median :0.000e+00   Median :0         
##  Mean   :0.02699   Mean   :0    Mean   :2.865e-05   Mean   :0         
##  3rd Qu.:0.03776   3rd Qu.:0    3rd Qu.:0.000e+00   3rd Qu.:0         
##  Max.   :0.12204   Max.   :0    Max.   :4.441e-03   Max.   :0         
##                                                                       
##  ris_airp_runway ris_borrowpits ris_camp_industrial ris_clearing_unknown
##  Min.   :0       Min.   :0      Min.   :0           Min.   :0           
##  1st Qu.:0       1st Qu.:0      1st Qu.:0           1st Qu.:0           
##  Median :0       Median :0      Median :0           Median :0           
##  Mean   :0       Mean   :0      Mean   :0           Mean   :0           
##  3rd Qu.:0       3rd Qu.:0      3rd Qu.:0           3rd Qu.:0           
##  Max.   :0       Max.   :0      Max.   :0           Max.   :0           
##                                                                         
##   ris_drainage ris_facility_operations ris_facility_unknown ris_mines_oilsands
##  Min.   :0     Min.   :0.0000000       Min.   :0            Min.   :0         
##  1st Qu.:0     1st Qu.:0.0000000       1st Qu.:0            1st Qu.:0         
##  Median :0     Median :0.0000000       Median :0            Median :0         
##  Mean   :0     Mean   :0.0003528       Mean   :0            Mean   :0         
##  3rd Qu.:0     3rd Qu.:0.0000000       3rd Qu.:0            3rd Qu.:0         
##  Max.   :0     Max.   :0.0546781       Max.   :0            Max.   :0         
##                                                                               
##  ris_oilsands_rms ris_overburden_dump   ris_plant ris_reclaim_ready
##  Min.   :0        Min.   :0           Min.   :0   Min.   :0        
##  1st Qu.:0        1st Qu.:0           1st Qu.:0   1st Qu.:0        
##  Median :0        Median :0           Median :0   Median :0        
##  Mean   :0        Mean   :0           Mean   :0   Mean   :0        
##  3rd Qu.:0        3rd Qu.:0           3rd Qu.:0   3rd Qu.:0        
##  Max.   :0        Max.   :0           Max.   :0   Max.   :0        
##                                                                    
##  ris_reclaimed_certified ris_reclaimed_permanent ris_reclaimed_temp
##  Min.   :0               Min.   :0.0000000       Min.   :0.000000  
##  1st Qu.:0               1st Qu.:0.0000000       1st Qu.:0.000000  
##  Median :0               Median :0.0000000       Median :0.000000  
##  Mean   :0               Mean   :0.0002803       Mean   :0.000318  
##  3rd Qu.:0               3rd Qu.:0.0000000       3rd Qu.:0.000000  
##  Max.   :0               Max.   :0.0434483       Max.   :0.016762  
##                                                                    
##     ris_road         ris_soil_replaced ris_soil_salvaged ris_tailing_pond   
##  Min.   :0.0000000   Min.   :0         Min.   :0         Min.   :0.0000000  
##  1st Qu.:0.0000000   1st Qu.:0         1st Qu.:0         1st Qu.:0.0000000  
##  Median :0.0000000   Median :0         Median :0         Median :0.0000000  
##  Mean   :0.0000302   Mean   :0         Mean   :0         Mean   :0.0009116  
##  3rd Qu.:0.0000000   3rd Qu.:0         3rd Qu.:0         3rd Qu.:0.0000000  
##  Max.   :0.0046809   Max.   :0         Max.   :0         Max.   :0.1413014  
##                                                                             
##  ris_tank_farm ris_transmission_line ris_utilities   ris_waste  ris_windrow
##  Min.   :0     Min.   :0             Min.   :0     Min.   :0   Min.   :0   
##  1st Qu.:0     1st Qu.:0             1st Qu.:0     1st Qu.:0   1st Qu.:0   
##  Median :0     Median :0             Median :0     Median :0   Median :0   
##  Mean   :0     Mean   :0             Mean   :0     Mean   :0   Mean   :0   
##  3rd Qu.:0     3rd Qu.:0             3rd Qu.:0     3rd Qu.:0   3rd Qu.:0   
##  Max.   :0     Max.   :0             Max.   :0     Max.   :0   Max.   :0   
##                                                                            
##  rlwy_dbl_track rlwy_mlt_track rlwy_sgl_track   rlwy_spur road_gravel_1l    
##  Min.   :0      Min.   :0      Min.   :0      Min.   :0   Min.   :0.000000  
##  1st Qu.:0      1st Qu.:0      1st Qu.:0      1st Qu.:0   1st Qu.:0.000000  
##  Median :0      Median :0      Median :0      Median :0   Median :0.004254  
##  Mean   :0      Mean   :0      Mean   :0      Mean   :0   Mean   :0.004548  
##  3rd Qu.:0      3rd Qu.:0      3rd Qu.:0      3rd Qu.:0   3rd Qu.:0.007252  
##  Max.   :0      Max.   :0      Max.   :0      Max.   :0   Max.   :0.022773  
##                                                                             
##  road_gravel_2l     road_paved_1l road_paved_2l road_paved_3l road_paved_4l
##  Min.   :0.000000   Min.   :0     Min.   :0     Min.   :0     Min.   :0    
##  1st Qu.:0.000000   1st Qu.:0     1st Qu.:0     1st Qu.:0     1st Qu.:0    
##  Median :0.000000   Median :0     Median :0     Median :0     Median :0    
##  Mean   :0.001748   Mean   :0     Mean   :0     Mean   :0     Mean   :0    
##  3rd Qu.:0.000000   3rd Qu.:0     3rd Qu.:0     3rd Qu.:0     3rd Qu.:0    
##  Max.   :0.015867   Max.   :0     Max.   :0     Max.   :0     Max.   :0    
##                                                                            
##  road_paved_5l road_paved_div road_paved_undiv_1l road_paved_undiv_2l
##  Min.   :0     Min.   :0      Min.   :0.0000000   Min.   :0.0000000  
##  1st Qu.:0     1st Qu.:0      1st Qu.:0.0000000   1st Qu.:0.0000000  
##  Median :0     Median :0      Median :0.0000000   Median :0.0000000  
##  Mean   :0     Mean   :0      Mean   :0.0001162   Mean   :0.0005722  
##  3rd Qu.:0     3rd Qu.:0      3rd Qu.:0.0000000   3rd Qu.:0.0000000  
##  Max.   :0     Max.   :0      Max.   :0.0085401   Max.   :0.0118399  
##                                                                      
##  road_unclassified  road_unimproved    road_unpaved_1l road_unpaved_2l
##  Min.   :0.00e+00   Min.   :0.000000   Min.   :0       Min.   :0      
##  1st Qu.:0.00e+00   1st Qu.:0.000000   1st Qu.:0       1st Qu.:0      
##  Median :0.00e+00   Median :0.000000   Median :0       Median :0      
##  Mean   :2.20e-06   Mean   :0.001069   Mean   :0       Mean   :0      
##  3rd Qu.:0.00e+00   3rd Qu.:0.001017   3rd Qu.:0       3rd Qu.:0      
##  Max.   :3.41e-04   Max.   :0.010709   Max.   :0       Max.   :0      
##                                                                       
##   road_winter rough_pasture           runway          rural_residence    
##  Min.   :0    Min.   :0.0000000   Min.   :0.000e+00   Min.   :0.000e+00  
##  1st Qu.:0    1st Qu.:0.0000000   1st Qu.:0.000e+00   1st Qu.:0.000e+00  
##  Median :0    Median :0.0000000   Median :0.000e+00   Median :0.000e+00  
##  Mean   :0    Mean   :0.0001776   Mean   :9.358e-05   Mean   :5.795e-06  
##  3rd Qu.:0    3rd Qu.:0.0000000   3rd Qu.:0.000e+00   3rd Qu.:0.000e+00  
##  Max.   :0    Max.   :0.0149983   Max.   :1.451e-02   Max.   :8.982e-04  
##                                                                          
##       sump          surrounding_veg  tailing_pond  tame_pasture      
##  Min.   :0.000000   Min.   :0       Min.   :0     Min.   :0.000e+00  
##  1st Qu.:0.000000   1st Qu.:0       1st Qu.:0     1st Qu.:0.000e+00  
##  Median :0.000000   Median :0       Median :0     Median :0.000e+00  
##  Mean   :0.003364   Mean   :0       Mean   :0     Mean   :4.727e-06  
##  3rd Qu.:0.002012   3rd Qu.:0       3rd Qu.:0     3rd Qu.:0.000e+00  
##  Max.   :0.033997   Max.   :0       Max.   :0     Max.   :7.326e-04  
##                                                                      
##      trail           transfer_station transmission_line   truck_trail       
##  Min.   :0.0000000   Min.   :0        Min.   :0.000000   Min.   :0.0000000  
##  1st Qu.:0.0000000   1st Qu.:0        1st Qu.:0.000000   1st Qu.:0.0000000  
##  Median :0.0001657   Median :0        Median :0.000000   Median :0.0000000  
##  Mean   :0.0009478   Mean   :0        Mean   :0.007669   Mean   :0.0008284  
##  3rd Qu.:0.0015165   3rd Qu.:0        3rd Qu.:0.000000   3rd Qu.:0.0000000  
##  Max.   :0.0068343   Max.   :0        Max.   :0.070051   Max.   :0.0149490  
##                                                                             
##  urban_industrial   urban_residence vegetated_edge_railways
##  Min.   :0.000000   Min.   :0       Min.   :0              
##  1st Qu.:0.000000   1st Qu.:0       1st Qu.:0              
##  Median :0.000000   Median :0       Median :0              
##  Mean   :0.002782   Mean   :0       Mean   :0              
##  3rd Qu.:0.000000   3rd Qu.:0       3rd Qu.:0              
##  Max.   :0.215891   Max.   :0       Max.   :0              
##                                                            
##  vegetated_edge_roads   well_aband        well_bitumen        well_cased       
##  Min.   :0.00000      Min.   :0.000000   Min.   :0.000000   Min.   :0.0000000  
##  1st Qu.:0.00379      1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0.0000000  
##  Median :0.01016      Median :0.001888   Median :0.000000   Median :0.0000000  
##  Mean   :0.01569      Mean   :0.004932   Mean   :0.009243   Mean   :0.0001615  
##  3rd Qu.:0.02866      3rd Qu.:0.007000   3rd Qu.:0.012968   3rd Qu.:0.0000000  
##  Max.   :0.06275      Max.   :0.042874   Max.   :0.083850   Max.   :0.0071111  
##                                                                                
##  well_cleared_not_confirmed well_cleared_not_drilled    well_gas        
##  Min.   :0.0000000          Min.   :0                Min.   :0.000e+00  
##  1st Qu.:0.0000000          1st Qu.:0                1st Qu.:0.000e+00  
##  Median :0.0000000          Median :0                Median :0.000e+00  
##  Mean   :0.0006285          Mean   :0                Mean   :8.574e-05  
##  3rd Qu.:0.0000000          3rd Qu.:0                3rd Qu.:0.000e+00  
##  Max.   :0.0365581          Max.   :0                Max.   :2.579e-03  
##                                                                         
##     well_oil   well_other        well_unknown
##  Min.   :0   Min.   :0.000000   Min.   :0    
##  1st Qu.:0   1st Qu.:0.000000   1st Qu.:0    
##  Median :0   Median :0.000000   Median :0    
##  Mean   :0   Mean   :0.001517   Mean   :0    
##  3rd Qu.:0   3rd Qu.:0.000000   3rd Qu.:0    
##  Max.   :0   Max.   :0.030134   Max.   :0    
## 

Let’s also plot histograms of each variable for data visualization in a for loop, I wanted to do this for just one buffer size to reduce replicates but it will also drop any variables for which all the data are zeros, so you could explore this at different buffer widths or just remove the filter function and look at all the data which is what I do below once it is grouped

# filter to just one buffer width

covariates_1000 <- covariates %>%  
  
  filter(buff_dist == 1000)

for (col in 1:ncol(covariates_1000)) {
    hist(covariates_1000[,col])
}

Now we can use the information from the previous few steps as well as the variable descriptions from the ABMI human footprints wall to wall data download website for Year 2021 which is stored in the ‘relevant literature’ portion of this document AND also copied into the README file, to group the covariates so we reduce the number of potential variables to explore in the modeling phase.

We will use the mutate() function with some tidyverse trickery (i.e., nesting across() and contains() in rowsums()) to sum across each observation (row) by searching for various character strings. If there isn’t a common character string for multiple variables we want to sum then we provide each one individually. We can also combine these methods (e.g., with ‘facilities’ [see code]).

covariates_grouped <- covariates %>% 
  
  # rename 'vegetated_edge_roads so that we can use road as keyword to group roads without including this feature
  rename('vegetated_edge_rds' = vegetated_edge_roads) %>% 
  
  # within the mutate function create new column names for the grouped variables
  mutate(
    # borrowpits
    borrowpits = rowSums(across(contains('borrowpit'))) + # here we use rowsums with across() and contains() to sum acrross each row any values for columns that contain the keyword above. Be careful when using that there aren't any variables that match the string (keyword) provided that you don't want to include!
      
      dugout +
      lagoon +
      sump,
    
    
    # clearings
    clearings = rowSums(across(contains('clearing'))) +
      runway,
    
    # cultivations
    cultivation = crop + 
      cultivation_abandoned +
      fruit_vegetables +
      rough_pasture +
      tame_pasture,
    
    # harvest areas
    harvest = rowSums(across(contains('harvest'))),
    
    # industrial facilities
    facilities = rowSums(across(contains('facility'))) +
      rowSums(across(contains('plant'))) +
      camp_industrial +
      mill +
      ris_camp_industrial +
      ris_tank_farm +
      ris_utilities +
      urban_industrial,
    
    # mine areas
    mines = rowSums(across(contains('mine'))) +
      rowSums(across(contains('tailing'))) +
      grvl_sand_pit +
      peat +
      ris_drainage +
      ris_oilsands_rms +
      ris_overburden_dump +
      ris_reclaim_ready +
      ris_soil_salvaged +
      ris_waste,
    
    # railways
    railways = rowSums(across(contains('rlwy'))),
    
    # reclaimed areas
    reclaimed = rowSums(across(contains('reclaimed'))) +
      ris_soil_replaced +
      ris_windrow,
    
    # recreation areas
    recreation = campground +
      golfcourse +
      greenspace +
      recreation,
    
    # residential areas (can't use residence as keyword because 'residence_clearing' is in clearing unless we rearrange groupings or rename that one)
    residential = country_residence +
      rural_residence +
      urban_residence,
    
    # roads (we renamed 'vegetated_edge_roads' above to 'vegetated_edge_rds' so we can use roads as keyword here which saves a bunch of coding as there are many many road variables)
    roads = rowSums(across(contains('road'))) +
      interchange_ramp +
      airp_runway +
      ris_airp_runway,
    
    # seismic lines
    seismic_lines = rowSums(across(contains('seismic'))),
    
    # transmission lines
    transmission_lines = rowSums(across(contains('transmission'))),
    
    # trails
    trails = rowSums(across(contains('trail'))),
    
    # vegetated edges
    veg_edges = rowSums(across(contains('vegetated'))) +
      surrounding_veg,
    
    # man-made water features
    water = canal +
      reservoir,
    
    # well sites (this probably includes 'clearing_wellpad' need to check)
    wells = rowSums(across(contains('well'))),
    
    # remove columns that were used to create new columns to tidy the data frame
         .keep = 'unused') %>% 
  
  # reorder variables so the veg data is after all the HFI data
  relocate(starts_with('lc_class'),
           .after = wells)

# see what's left
names(covariates_grouped)
##  [1] "array"              "site"               "camera"            
##  [4] "buff_dist"          "borrowpits"         "cfo"               
##  [7] "landfill"           "pipeline"           "recreation"        
## [10] "transfer_station"   "clearings"          "cultivation"       
## [13] "harvest"            "facilities"         "mines"             
## [16] "railways"           "reclaimed"          "residential"       
## [19] "roads"              "seismic_lines"      "transmission_lines"
## [22] "trails"             "veg_edges"          "water"             
## [25] "wells"              "lc_class110"        "lc_class120"       
## [28] "lc_class20"         "lc_class210"        "lc_class220"       
## [31] "lc_class230"        "lc_class32"         "lc_class33"        
## [34] "lc_class34"         "lc_class50"
# check the structure of new data
str(covariates_grouped)
## tibble [3,100 × 35] (S3: tbl_df/tbl/data.frame)
##  $ array             : Factor w/ 4 levels "LU13","LU15",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ site              : Factor w/ 155 levels "LU13_18","LU13_15",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ camera            : Factor w/ 96 levels "18","15","03",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ buff_dist         : Factor w/ 20 levels "250","500","750",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ borrowpits        : num [1:3100] 0 0 0 0 0 ...
##  $ cfo               : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ landfill          : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ pipeline          : num [1:3100] 0 0.068 0 0 0.0301 ...
##  $ recreation        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ transfer_station  : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ clearings         : num [1:3100] 0.0923 0.0697 0 0 0 ...
##  $ cultivation       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ harvest           : num [1:3100] 0 0 0.687 0.337 0 ...
##  $ facilities        : num [1:3100] 0.291 0 0 0 0 ...
##  $ mines             : num [1:3100] 0 0.0873 0 0 0 ...
##  $ railways          : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ reclaimed         : num [1:3100] 0 0.0477 0 0 0 ...
##  $ residential       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ roads             : num [1:3100] 0 0.0174 0 0 0 ...
##  $ seismic_lines     : num [1:3100] 0 0.03276 0 0.00889 0.06377 ...
##  $ transmission_lines: num [1:3100] 0.0642 0 0 0 0.091 ...
##  $ trails            : num [1:3100] 0.00588 0.0028 0 0.01591 0 ...
##  $ veg_edges         : num [1:3100] 0 0.0858 0 0 0 ...
##  $ water             : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ wells             : num [1:3100] 0 0 0 0 0.0322 ...
##  $ lc_class110       : num [1:3100] 0.193 0.348 0 0 0.178 ...
##  $ lc_class120       : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class20        : num [1:3100] 0.0361 0 0 0 0 ...
##  $ lc_class210       : num [1:3100] 0.456 0.358 0.186 1 0.822 ...
##  $ lc_class220       : num [1:3100] 0 0 0 0 0 ...
##  $ lc_class230       : num [1:3100] 0 0.101 0.255 0 0 ...
##  $ lc_class32        : num [1:3100] 0 0 0 0 0 0 0 0 0 0 ...
##  $ lc_class33        : num [1:3100] 0 0.101 0 0 0 ...
##  $ lc_class34        : num [1:3100] 0 0.0916 0 0 0 ...
##  $ lc_class50        : num [1:3100] 0.316 0 0.559 0 0 ...
# check summary of new data
summary(covariates_grouped)
##   array          site          camera       buff_dist      borrowpits      
##  LU13:820   LU13_18:  20   27     :  80   250    : 155   Min.   :0.000000  
##  LU15:780   LU13_15:  20   32     :  80   500    : 155   1st Qu.:0.000000  
##  LU21:720   LU13_03:  20   41     :  80   750    : 155   Median :0.001649  
##  LU01:780   LU13_34:  20   36     :  80   1000   : 155   Mean   :0.004302  
##             LU13_57:  20   16     :  60   1250   : 155   3rd Qu.:0.004453  
##             LU13_16:  20   21     :  60   1500   : 155   Max.   :0.310957  
##             (Other):2980   (Other):2660   (Other):2170                     
##       cfo               landfill    pipeline         recreation       
##  Min.   :0.000e+00   Min.   :0   Min.   :0.00000   Min.   :0.000e+00  
##  1st Qu.:0.000e+00   1st Qu.:0   1st Qu.:0.00000   1st Qu.:0.000e+00  
##  Median :0.000e+00   Median :0   Median :0.01350   Median :0.000e+00  
##  Mean   :8.077e-07   Mean   :0   Mean   :0.01937   Mean   :4.904e-05  
##  3rd Qu.:0.000e+00   3rd Qu.:0   3rd Qu.:0.02812   3rd Qu.:0.000e+00  
##  Max.   :1.215e-03   Max.   :0   Max.   :0.28897   Max.   :1.337e-02  
##                                                                       
##  transfer_station   clearings          cultivation           harvest       
##  Min.   :0        Min.   :0.0000000   Min.   :0.0000000   Min.   :0.00000  
##  1st Qu.:0        1st Qu.:0.0000000   1st Qu.:0.0000000   1st Qu.:0.00000  
##  Median :0        Median :0.0005278   Median :0.0000000   Median :0.00000  
##  Mean   :0        Mean   :0.0060419   Mean   :0.0009397   Mean   :0.01868  
##  3rd Qu.:0        3rd Qu.:0.0040539   3rd Qu.:0.0000000   3rd Qu.:0.01348  
##  Max.   :0        Max.   :0.4024400   Max.   :0.1253361   Max.   :0.83674  
##                                                                            
##    facilities           mines             railways   reclaimed       
##  Min.   :0.000000   Min.   :0.000000   Min.   :0   Min.   :0.000000  
##  1st Qu.:0.000000   1st Qu.:0.000000   1st Qu.:0   1st Qu.:0.000000  
##  Median :0.000000   Median :0.000000   Median :0   Median :0.000000  
##  Mean   :0.006653   Mean   :0.005448   Mean   :0   Mean   :0.001002  
##  3rd Qu.:0.002769   3rd Qu.:0.000000   3rd Qu.:0   3rd Qu.:0.000000  
##  Max.   :0.335753   Max.   :0.557884   Max.   :0   Max.   :0.078321  
##                                                                      
##   residential            roads          seismic_lines      transmission_lines
##  Min.   :0.0000000   Min.   :0.000000   Min.   :0.000000   Min.   :0.000000  
##  1st Qu.:0.0000000   1st Qu.:0.001040   1st Qu.:0.003046   1st Qu.:0.000000  
##  Median :0.0000000   Median :0.004019   Median :0.008356   Median :0.000000  
##  Mean   :0.0001473   Mean   :0.006218   Mean   :0.011034   Mean   :0.005597  
##  3rd Qu.:0.0000000   3rd Qu.:0.008650   3rd Qu.:0.012896   3rd Qu.:0.007232  
##  Max.   :0.0180541   Max.   :0.071829   Max.   :0.093433   Max.   :0.173909  
##                                                                              
##      trails            veg_edges            water               wells          
##  Min.   :0.000e+00   Min.   :0.000000   Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:9.465e-05   1st Qu.:0.001437   1st Qu.:0.000e+00   1st Qu.:0.0008692  
##  Median :7.187e-04   Median :0.006425   Median :0.000e+00   Median :0.0068416  
##  Mean   :1.516e-03   Mean   :0.011335   Mean   :1.254e-05   Mean   :0.0143883  
##  3rd Qu.:1.958e-03   3rd Qu.:0.015562   3rd Qu.:0.000e+00   3rd Qu.:0.0167246  
##  Max.   :3.864e-02   Max.   :0.147895   Max.   :7.896e-03   Max.   :0.3045854  
##                                                                                
##   lc_class110       lc_class120          lc_class20       lc_class210    
##  Min.   :0.00000   Min.   :0.0000000   Min.   :0.00000   Min.   :0.0000  
##  1st Qu.:0.01970   1st Qu.:0.0000000   1st Qu.:0.00000   1st Qu.:0.4607  
##  Median :0.03874   Median :0.0000000   Median :0.00000   Median :0.6749  
##  Mean   :0.04838   Mean   :0.0007554   Mean   :0.02741   Mean   :0.6324  
##  3rd Qu.:0.06218   3rd Qu.:0.0000000   3rd Qu.:0.03361   3rd Qu.:0.8364  
##  Max.   :0.73192   Max.   :0.1211446   Max.   :0.51965   Max.   :1.0000  
##                                                                          
##   lc_class220        lc_class230        lc_class32          lc_class33       
##  Min.   :0.000000   Min.   :0.00000   Min.   :0.000e+00   Min.   :0.0000000  
##  1st Qu.:0.002332   1st Qu.:0.01218   1st Qu.:0.000e+00   1st Qu.:0.0000000  
##  Median :0.044977   Median :0.03595   Median :0.000e+00   Median :0.0000000  
##  Mean   :0.113317   Mean   :0.06341   Mean   :1.748e-05   Mean   :0.0046114  
##  3rd Qu.:0.154669   3rd Qu.:0.08419   3rd Qu.:0.000e+00   3rd Qu.:0.0005702  
##  Max.   :0.971773   Max.   :0.72101   Max.   :1.175e-02   Max.   :0.3242328  
##                                                                              
##    lc_class34         lc_class50     
##  Min.   :0.000000   Min.   :0.00000  
##  1st Qu.:0.000000   1st Qu.:0.02385  
##  Median :0.004043   Median :0.05717  
##  Mean   :0.030311   Mean   :0.07933  
##  3rd Qu.:0.038149   3rd Qu.:0.11545  
##  Max.   :0.557178   Max.   :0.60824  
## 
# there are some NAs in the data which will cause problems with modeling/visualization of data ignore for now but will explore these sites specifically after report

covariates_grouped <- covariates_grouped %>% 
  
  # remove rows with NAs
  na.omit()

Let’s look at the histograms again and see if we need to remove any features or feature groups without enough data

# use for loop to plot histograms for all covariates

for (col in 5:ncol(covariates_grouped)) {
    hist(covariates_grouped[,col])
}

> IMO we don’t have enough variation in data to use the following features/feature groups

  • cfo
  • Cultivation
  • Reclaimed
  • Recreation
  • Reservoir
  • Residential
  • Water
  • lc_class120 (aka agriculture)
  • lc_class32 (aka rocks and rubble)

We also don’t have any data for following features since they don’t plot with the hist() function

  • Interchange ramp
  • Landfill
  • railways

So let’s modify this data and remove those features for now this step will need to be changed each year likely

covariates_grouped <- covariates_grouped %>% 
  
  # remove features we don't need
  select(!c(cfo,
            cultivation,
            reclaimed,
            recreation,
            residential,
            water,
            lc_class120,
            lc_class32,
            landfill,
            transfer_station,
            railways))

# check that it worked
names(covariates_grouped)
##  [1] "array"              "site"               "camera"            
##  [4] "buff_dist"          "borrowpits"         "pipeline"          
##  [7] "clearings"          "harvest"            "facilities"        
## [10] "mines"              "roads"              "seismic_lines"     
## [13] "transmission_lines" "trails"             "veg_edges"         
## [16] "wells"              "lc_class110"        "lc_class20"        
## [19] "lc_class210"        "lc_class220"        "lc_class230"       
## [22] "lc_class33"         "lc_class34"         "lc_class50"

Subset data & correlation plots

Marissa try to get the purrr code for this to work later

Now we need to subset the data for each buffer width, and then in the same loop let’s make correlation plots for these variables within each buffer

# Couldn't get this to work in purrr yet so using a loop to subset the data, create the plots, and save them all in one section... NEAT

buffer_frames <- list()

for (i in unique(covariates_grouped$buff_dist)){
  
  print(i)
  
  #Subset data based on radius
  df<-covariates_grouped %>%
    filter(buff_dist == i)
  
  #rename dataframe on the fly
  assign(paste("df", i, sep ="_"), df)
  
  #list of dataframes
  buffer_frames <-c (buffer_frames, list(df))
  
  #Subset data based on radius
  df<-covariates_grouped %>%
    filter(buff_dist == i) %>%
    select(where(is.numeric))
  
 #compute a correlation matrix (watch for errors)
 matrix <- cor(df)
 
 # print and save the correlation plot on the go
 # renaming for each buffer as we do
 png(file.path("figures/", paste("correlation_", i, ".png")))
 corrplot::corrplot(matrix,
                    type = 'upper',
                    tl.col = 'black',
                    title = paste0('Variable correlation plot at ', i))
 dev.off()
  
}
## [1] "250"
## [1] "500"
## [1] "750"
## [1] "1000"
## [1] "1250"
## [1] "1500"
## [1] "1750"
## [1] "2000"
## [1] "2250"
## [1] "2500"
## [1] "2750"
## [1] "3000"
## [1] "3250"
## [1] "3500"
## [1] "3750"
## [1] "4000"
## [1] "4250"
## [1] "4500"
## [1] "4750"
## [1] "5000"
# name list objects so we can extract names for plotting 

buffer_frames <- buffer_frames %>% 
  
  # absurdly long way to do this but for sake of time fuck it
  purrr::set_names('250 meter buffer',
                   '500 meter buffer',
                   '750 meter buffer',
                   '1000 meter buffer',
                   '1250 meter buffer',
                   '1500 meter buffer',
                   '1750 meter buffer',
                   '2000 meter buffer',
                   '2250 meter buffer',
                   '2500 meter buffer',
                   '2750 meter buffer',
                   '3000 meter buffer',
                   '3250 meter buffer',
                   '3500 meter buffer',
                   '3750 meter buffer',
                   '4000 meter buffer',
                   '4250 meter buffer',
                   '4500 meter buffer',
                   '4750 meter buffer',
                   '5000 meter buffer')

You will get a warning about standard deviation is zero for buffer 250

Exploratory plots

add more to this section in later when we have more time to explore the covariates and choose which should be inlcuded etc.

# use this code to change figure margins otherwise will not plot because figure margines are too large
par(mar=c(1,1,1,1))

# now use purrr to plot histograms for all remaining HFI variables for each buffer
hfi_histograms <- buffer_frames %>% 
  
  purrr::imap(
    ~.x %>% 
      
      # filter to just the HFI variables 
      select(where(is.numeric) &
          ! starts_with('lc_class')) %>% 
      
      # pipe into hist.data.frame function to make histograms for each variable
      hist.data.frame(mtitl = paste0('Histograms of HFI variables at ', .y)))

Now let’s do the same thing with the landcover variables

lc_histograms <- buffer_frames %>% 
  
  purrr::imap(
    ~.x %>% 
      
      # filter to just the landcover variables 
      select(where(is.numeric) &
          starts_with('lc_class')) %>% 
      
      # pipe into hist.data.frame function to make histograms for each variable
      hist.data.frame(mtitl = paste0('Histograms of landcover variables at ', .y)))

Add response metric

Now that we have the covariate data formatted we need to add the response metric (monthly proportional presence/absence) to the data frames

final_osm_2023_df <- buffer_frames %>% 
  
  purrr::map(
    ~.x %>% 
      
      left_join(prop_detections,
                by = 'site'))

Analysis

Now we are going to run a global model which includes all HFI and LC variables that at first glance (will do a more thorough check later) seem to have enough data to include as covariates for each buffer width, and then we will compare these models see which buffer width best fit the data for each species.

We don’t need to do ALL the species since many don’t have enough data.

Refer to the ACME_camera_script_9-2-2024.html or .Rmd the plot for proportional monthly detections should provide info on which species we have enough data for, can be found under Response metrics/3.Proportion monthly detections

A brief look at this fig indicates that we have enough for all the mammals in the prop_detections data frame except

  • cougar
  • wolverine
  • caribou??? (may have enough, may not)

Black bear

ANDREW/MARISSA FIX later??

there is probably a way to shorten the following code to select particular species, I saw Andrew’s for loop in the draft script he wrote but couldn’t quite figure out how to adapt it to my purposes with the data formatted the way I have it, so I did this instead, maybe we can merge approaches later to clean this up?

Global model

Marissa SCALE MODELS

actually don’t need to since all variables are proportions already right?

Let’s start with bears and use purrr to create a global model for every buffer distance

Recall purrr::map() is magical for iterations and will apply all the functions within the map() function to each item of the list supplied before the the map() function.

# create models for black bears at each buffer size
black_bear_mods <- final_osm_2023_df %>%

  # use purrr map to fun the same functions for all buffer sizes ((all objects in list))
  purrr::map(
    ~.x %>%

      # glmmTMB function let's us run the proportional binomial model using cbind to combine the present and absent columns for each species
     glmmTMB::glmmTMB(cbind(black_bear, absent_black_bear) ~
                        
                        # HFI covariates in alphabetical order
                        scale(borrowpits) +
                        scale(pipeline) +
                        scale(clearings) +
                        scale(harvest) +
                        scale(facilities) +
                        scale(mines) +
                        scale(roads) +
                        scale(seismic_lines) +
                        scale(transmission_lines) +
                        scale(trails) +
                         scale(veg_edges) +
                        scale(wells) +
                        
                        # VEG covariates in numerical order
                        scale(lc_class110) +
                        scale(lc_class20) +
                        scale(lc_class210) +
                        scale(lc_class220) +
                        scale(lc_class230) +
                        scale(lc_class33) +
                        scale(lc_class34) +
                        scale(lc_class50) +
                        
                        # Random effect of array
                        (1|array),
                   data = .,
                   family = 'binomial'))
## dropping columns from rank-deficient conditional model: scale(lc_class50)
## dropping columns from rank-deficient conditional model: scale(lc_class50)
## dropping columns from rank-deficient conditional model: scale(lc_class50)
## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; function evaluation limit reached without convergence (9). See
## vignette('troubleshooting'), help('diagnose')

Lots of notes about model convergence problems. Likely because there are too many variables in model and some without enough variation in data to reliably calculate and estime. We should revisit the data and thin it out a bit more

Model selection

We will use the model.sel() function from the MuMIn package to compare the global models for each buffer width and see which buffer fits the black bear data best

model.sel(black_bear_mods)
## Model selection table 
##                   cnd((Int)) dsp((Int)) cnd(scl(brr)) cnd(scl(clr))
## 250 meter buffer     -0.7159          +      0.144600       0.05057
## 500 meter buffer     -0.7088          +      0.046270       0.07333
## 750 meter buffer     -0.7007          +     -0.078630       0.12350
## 5000 meter buffer    -0.7029          +     -0.316500       0.29380
## 1500 meter buffer    -0.7027          +     -0.083810       0.08239
## 4000 meter buffer    -0.7006          +     -0.317800       0.35050
## 2000 meter buffer    -0.6985          +     -0.017770       0.09011
## 1250 meter buffer    -0.7008          +     -0.010970       0.07194
## 1750 meter buffer    -0.7007          +     -0.023210       0.09028
## 4250 meter buffer    -0.7007          +     -0.384700       0.45500
## 2250 meter buffer    -0.6964          +     -0.075840       0.10830
## 2500 meter buffer    -0.6954          +     -0.095330       0.13260
## 3750 meter buffer    -0.6988          +     -0.139300       0.24620
## 4750 meter buffer    -0.6995          +     -0.367500       0.26420
## 4500 meter buffer    -0.6984          +     -0.452500       0.29860
## 3500 meter buffer    -0.6973          +     -0.038220       0.21970
## 2750 meter buffer    -0.6945          +      0.007392       0.20070
## 3250 meter buffer    -0.6961          +      0.035470       0.22290
## 3000 meter buffer    -0.6951          +      0.017220       0.23380
## 1000 meter buffer    -0.7014          +     -0.111300       0.08834
##                   cnd(scl(fcl)) cnd(scl(hrv)) cnd(scl(lc_c11)) cnd(scl(lc_c20))
## 250 meter buffer       -0.20860     -0.003377        -0.119800         -0.11050
## 500 meter buffer       -0.16850      0.004147         0.045270         -0.05206
## 750 meter buffer       -0.08756      0.008714         0.009072         -0.03443
## 5000 meter buffer      -0.29170      0.019870         0.069660         -0.09645
## 1500 meter buffer      -0.04146      0.014790         0.034900          0.13690
## 4000 meter buffer      -0.19930      0.103800        -0.043040         -0.29760
## 2000 meter buffer      -0.12390      0.035990        -0.032260         -0.02103
## 1250 meter buffer      -0.04526      0.002422         0.298600          0.43880
## 1750 meter buffer      -0.09979      0.023800        -0.011810          0.03487
## 4250 meter buffer      -0.15980      0.103500        -0.080840         -0.31420
## 2250 meter buffer      -0.15580      0.042700        -0.187000         -0.32740
## 2500 meter buffer      -0.15390      0.057230        -0.229900         -0.40630
## 3750 meter buffer      -0.30110      0.101400        -0.120200         -0.52120
## 4750 meter buffer      -0.20040      0.048850         0.045380         -0.15230
## 4500 meter buffer      -0.10790      0.063170        -0.023030         -0.24550
## 3500 meter buffer      -0.25790      0.104300        -0.129000         -0.46140
## 2750 meter buffer      -0.16130      0.080510        -0.195200         -0.47250
## 3250 meter buffer      -0.18760      0.114400        -0.172000         -0.54520
## 3000 meter buffer      -0.17800      0.098270        -0.242800         -0.65260
## 1000 meter buffer      -0.07708      0.004195        -1.465000         -1.65100
##                   cnd(scl(lc_c21)) cnd(scl(lc_c22)) cnd(scl(lc_c23))
## 250 meter buffer          -0.50310          -0.2052         -0.28910
## 500 meter buffer          -0.16390           0.0941         -0.14570
## 750 meter buffer           0.05084           0.2703         -0.01426
## 5000 meter buffer         -0.39190          -0.1597         -0.12190
## 1500 meter buffer          0.72000           0.6888          0.19250
## 4000 meter buffer         -1.07600          -0.5130         -0.30210
## 2000 meter buffer          0.35340           0.4799          0.09316
## 1250 meter buffer          1.99200           1.5630          0.64190
## 1750 meter buffer          0.41150           0.4925          0.09489
## 4250 meter buffer         -1.19500          -0.6112         -0.33400
## 2250 meter buffer         -0.80330          -0.3123         -0.22610
## 2500 meter buffer         -1.16700          -0.5670         -0.31720
## 3750 meter buffer         -1.90100          -1.0690         -0.52640
## 4750 meter buffer         -0.53210          -0.2337         -0.15260
## 4500 meter buffer         -0.95240          -0.5066         -0.25360
## 3500 meter buffer         -1.56900          -0.8072         -0.44010
## 2750 meter buffer         -1.45800          -0.7270         -0.40590
## 3250 meter buffer         -1.81100          -0.9565         -0.51350
## 3000 meter buffer         -2.19300          -1.2150         -0.59440
## 1000 meter buffer         -6.58600          -4.1880         -2.37600
##                   cnd(scl(lc_c33)) cnd(scl(lc_c34)) cnd(scl(lc_c50))
## 250 meter buffer         -0.146000         -0.14140                 
## 500 meter buffer         -0.111200          0.02455                 
## 750 meter buffer         -0.118500          0.02624                 
## 5000 meter buffer         0.270500          0.19920         -0.08929
## 1500 meter buffer        -0.194200         -0.02358          0.17930
## 4000 meter buffer        -0.192700         -0.12990         -0.29330
## 2000 meter buffer        -0.172200         -0.08948          0.01919
## 1250 meter buffer        -0.036700          0.28090          0.59770
## 1750 meter buffer        -0.249700         -0.07943          0.04180
## 4250 meter buffer        -0.095470         -0.03329         -0.30550
## 2250 meter buffer        -0.291800         -0.29500         -0.29890
## 2500 meter buffer        -0.420100         -0.33370         -0.37780
## 3750 meter buffer        -0.410300         -0.31310         -0.54380
## 4750 meter buffer         0.115100          0.10580         -0.14860
## 4500 meter buffer         0.008272         -0.01478         -0.25560
## 3500 meter buffer        -0.356500         -0.25620         -0.47190
## 2750 meter buffer        -0.550100         -0.44890         -0.45930
## 3250 meter buffer        -0.424900         -0.34270         -0.56400
## 3000 meter buffer        -0.545500         -0.51390         -0.67120
## 1000 meter buffer        -0.701400         -1.49800         -2.19700
##                   cnd(scl(mns)) cnd(scl(ppl)) cnd(scl(rds)) cnd(scl(ssm_lns))
## 250 meter buffer        0.17730       0.06263     -0.350000          -0.06385
## 500 meter buffer        0.13490      -0.05545     -0.299700          -0.07976
## 750 meter buffer        0.14960       0.06058     -0.247300          -0.13570
## 5000 meter buffer      -0.22170      -0.53620     -0.412300          -0.06155
## 1500 meter buffer       0.19620      -0.12380      0.186600          -0.18790
## 4000 meter buffer       0.01314      -0.26400     -0.219700          -0.10140
## 2000 meter buffer       0.13640      -0.25320      0.344100          -0.16260
## 1250 meter buffer       0.14800      -0.10360     -0.001055          -0.18070
## 1750 meter buffer       0.22750      -0.19500      0.294000          -0.18730
## 4250 meter buffer      -0.11660      -0.18610     -0.293100          -0.09876
## 2250 meter buffer       0.15090      -0.19680      0.460700          -0.13430
## 2500 meter buffer       0.24740      -0.14860      0.486300          -0.12260
## 3750 meter buffer       0.21800      -0.19970     -0.022810          -0.08265
## 4750 meter buffer      -0.18370      -0.42350     -0.194400          -0.05816
## 4500 meter buffer      -0.17970      -0.27850     -0.136900          -0.07673
## 3500 meter buffer       0.16050      -0.23730      0.053710          -0.09458
## 2750 meter buffer       0.35790      -0.18170      0.396400          -0.13140
## 3250 meter buffer       0.15540      -0.25380      0.195100          -0.10090
## 3000 meter buffer       0.27180      -0.20410      0.299600          -0.11560
## 1000 meter buffer       0.08223      -0.05412     -0.015710          -0.17790
##                   cnd(scl(trl)) cnd(scl(trn_lns)) cnd(scl(veg_edg))
## 250 meter buffer       0.149700         -0.137100          0.026480
## 500 meter buffer       0.100600         -0.040860          0.023270
## 750 meter buffer       0.069740         -0.092940          0.012010
## 5000 meter buffer     -0.020190          0.347600          0.095810
## 1500 meter buffer      0.108200         -0.015840         -0.153800
## 4000 meter buffer      0.026240          0.100600          0.020290
## 2000 meter buffer      0.095460          0.041260         -0.204200
## 1250 meter buffer      0.121100         -0.022110         -0.005932
## 1750 meter buffer      0.090890          0.014230         -0.200600
## 4250 meter buffer     -0.007788          0.080290         -0.099930
## 2250 meter buffer      0.099100         -0.036780         -0.182300
## 2500 meter buffer      0.082090         -0.075010         -0.245800
## 3750 meter buffer      0.084780          0.039750          0.020450
## 4750 meter buffer      0.007153          0.227900          0.035720
## 4500 meter buffer      0.016340          0.129000         -0.055290
## 3500 meter buffer      0.086630          0.048130         -0.063410
## 2750 meter buffer      0.056150         -0.023110         -0.310900
## 3250 meter buffer      0.074220          0.040800         -0.210900
## 3000 meter buffer      0.055610         -0.000563         -0.253600
## 1000 meter buffer      0.124600         -0.040290          0.021150
##                   cnd(scl(wll)) df   logLik  AICc delta
## 250 meter buffer      -0.022900 21 -301.426 651.9  0.00
## 500 meter buffer       0.080920 21 -308.069 665.1 13.29
## 750 meter buffer       0.141400 21 -309.625 668.3 16.40
## 5000 meter buffer      0.415300 22 -309.255 670.2 18.38
## 1500 meter buffer      0.122000 22 -309.876 671.5 19.63
## 4000 meter buffer      0.312900 22 -309.988 671.7 19.85
## 2000 meter buffer      0.061450 22 -310.388 672.5 20.65
## 1250 meter buffer      0.064970 22 -310.566 672.9 21.01
## 1750 meter buffer      0.076950 22 -310.571 672.9 21.01
## 4250 meter buffer      0.336400 22 -310.643 673.0 21.16
## 2250 meter buffer      0.003626 22 -311.175 674.1 22.22
## 2500 meter buffer      0.013230 22 -311.256 674.2 22.39
## 3750 meter buffer      0.078870 22 -311.489 674.7 22.85
## 4750 meter buffer      0.330600 22 -311.726 675.2 23.32
## 4500 meter buffer      0.372500 22 -311.994 675.7 23.86
## 3500 meter buffer      0.026350 22 -312.336 676.4 24.55
## 2750 meter buffer      0.074580 22 -312.422 676.6 24.72
## 3250 meter buffer     -0.011380 22 -312.606 676.9 25.09
## 3000 meter buffer      0.029770 22 -312.794 677.3 25.46
## 1000 meter buffer      0.077150 22                     
## Models ranked by AICc(x) 
## Random terms (all models): 
##   cond(1 | array)

hmmmm seems fishy to me that the 250 meter buffer which is the only one that had missing data would perform THAT much better than all the others, and really you shouldn’t compare models if they aren’t run on the same data, hence the warning message

Let’s remove the 250 buffer and see what happens. To do this we can use the discard_at() function in the purr package which will remove a list element based on it’s name (e.g. ‘250 meter buffer’).

After that we will rerun the model selection again and see if it looks better

black_bear_mods_no250 <- black_bear_mods  %>% 
  
  # purrr::discard_at will remove an item from a list
  purrr::discard_at('250 meter buffer')


# run model selection again
model.sel(black_bear_mods_no250)
## Model selection table 
##                   cnd((Int)) dsp((Int)) cnd(scl(brr)) cnd(scl(clr))
## 500 meter buffer     -0.7088          +      0.046270       0.07333
## 750 meter buffer     -0.7007          +     -0.078630       0.12350
## 5000 meter buffer    -0.7029          +     -0.316500       0.29380
## 1500 meter buffer    -0.7027          +     -0.083810       0.08239
## 4000 meter buffer    -0.7006          +     -0.317800       0.35050
## 2000 meter buffer    -0.6985          +     -0.017770       0.09011
## 1250 meter buffer    -0.7008          +     -0.010970       0.07194
## 1750 meter buffer    -0.7007          +     -0.023210       0.09028
## 4250 meter buffer    -0.7007          +     -0.384700       0.45500
## 2250 meter buffer    -0.6964          +     -0.075840       0.10830
## 2500 meter buffer    -0.6954          +     -0.095330       0.13260
## 3750 meter buffer    -0.6988          +     -0.139300       0.24620
## 4750 meter buffer    -0.6995          +     -0.367500       0.26420
## 4500 meter buffer    -0.6984          +     -0.452500       0.29860
## 3500 meter buffer    -0.6973          +     -0.038220       0.21970
## 2750 meter buffer    -0.6945          +      0.007392       0.20070
## 3250 meter buffer    -0.6961          +      0.035470       0.22290
## 3000 meter buffer    -0.6951          +      0.017220       0.23380
## 1000 meter buffer    -0.7014          +     -0.111300       0.08834
##                   cnd(scl(fcl)) cnd(scl(hrv)) cnd(scl(lc_c11)) cnd(scl(lc_c20))
## 500 meter buffer       -0.16850      0.004147         0.045270         -0.05206
## 750 meter buffer       -0.08756      0.008714         0.009072         -0.03443
## 5000 meter buffer      -0.29170      0.019870         0.069660         -0.09645
## 1500 meter buffer      -0.04146      0.014790         0.034900          0.13690
## 4000 meter buffer      -0.19930      0.103800        -0.043040         -0.29760
## 2000 meter buffer      -0.12390      0.035990        -0.032260         -0.02103
## 1250 meter buffer      -0.04526      0.002422         0.298600          0.43880
## 1750 meter buffer      -0.09979      0.023800        -0.011810          0.03487
## 4250 meter buffer      -0.15980      0.103500        -0.080840         -0.31420
## 2250 meter buffer      -0.15580      0.042700        -0.187000         -0.32740
## 2500 meter buffer      -0.15390      0.057230        -0.229900         -0.40630
## 3750 meter buffer      -0.30110      0.101400        -0.120200         -0.52120
## 4750 meter buffer      -0.20040      0.048850         0.045380         -0.15230
## 4500 meter buffer      -0.10790      0.063170        -0.023030         -0.24550
## 3500 meter buffer      -0.25790      0.104300        -0.129000         -0.46140
## 2750 meter buffer      -0.16130      0.080510        -0.195200         -0.47250
## 3250 meter buffer      -0.18760      0.114400        -0.172000         -0.54520
## 3000 meter buffer      -0.17800      0.098270        -0.242800         -0.65260
## 1000 meter buffer      -0.07708      0.004195        -1.465000         -1.65100
##                   cnd(scl(lc_c21)) cnd(scl(lc_c22)) cnd(scl(lc_c23))
## 500 meter buffer          -0.16390           0.0941         -0.14570
## 750 meter buffer           0.05084           0.2703         -0.01426
## 5000 meter buffer         -0.39190          -0.1597         -0.12190
## 1500 meter buffer          0.72000           0.6888          0.19250
## 4000 meter buffer         -1.07600          -0.5130         -0.30210
## 2000 meter buffer          0.35340           0.4799          0.09316
## 1250 meter buffer          1.99200           1.5630          0.64190
## 1750 meter buffer          0.41150           0.4925          0.09489
## 4250 meter buffer         -1.19500          -0.6112         -0.33400
## 2250 meter buffer         -0.80330          -0.3123         -0.22610
## 2500 meter buffer         -1.16700          -0.5670         -0.31720
## 3750 meter buffer         -1.90100          -1.0690         -0.52640
## 4750 meter buffer         -0.53210          -0.2337         -0.15260
## 4500 meter buffer         -0.95240          -0.5066         -0.25360
## 3500 meter buffer         -1.56900          -0.8072         -0.44010
## 2750 meter buffer         -1.45800          -0.7270         -0.40590
## 3250 meter buffer         -1.81100          -0.9565         -0.51350
## 3000 meter buffer         -2.19300          -1.2150         -0.59440
## 1000 meter buffer         -6.58600          -4.1880         -2.37600
##                   cnd(scl(lc_c33)) cnd(scl(lc_c34)) cnd(scl(lc_c50))
## 500 meter buffer         -0.111200          0.02455                 
## 750 meter buffer         -0.118500          0.02624                 
## 5000 meter buffer         0.270500          0.19920         -0.08929
## 1500 meter buffer        -0.194200         -0.02358          0.17930
## 4000 meter buffer        -0.192700         -0.12990         -0.29330
## 2000 meter buffer        -0.172200         -0.08948          0.01919
## 1250 meter buffer        -0.036700          0.28090          0.59770
## 1750 meter buffer        -0.249700         -0.07943          0.04180
## 4250 meter buffer        -0.095470         -0.03329         -0.30550
## 2250 meter buffer        -0.291800         -0.29500         -0.29890
## 2500 meter buffer        -0.420100         -0.33370         -0.37780
## 3750 meter buffer        -0.410300         -0.31310         -0.54380
## 4750 meter buffer         0.115100          0.10580         -0.14860
## 4500 meter buffer         0.008272         -0.01478         -0.25560
## 3500 meter buffer        -0.356500         -0.25620         -0.47190
## 2750 meter buffer        -0.550100         -0.44890         -0.45930
## 3250 meter buffer        -0.424900         -0.34270         -0.56400
## 3000 meter buffer        -0.545500         -0.51390         -0.67120
## 1000 meter buffer        -0.701400         -1.49800         -2.19700
##                   cnd(scl(mns)) cnd(scl(ppl)) cnd(scl(rds)) cnd(scl(ssm_lns))
## 500 meter buffer        0.13490      -0.05545     -0.299700          -0.07976
## 750 meter buffer        0.14960       0.06058     -0.247300          -0.13570
## 5000 meter buffer      -0.22170      -0.53620     -0.412300          -0.06155
## 1500 meter buffer       0.19620      -0.12380      0.186600          -0.18790
## 4000 meter buffer       0.01314      -0.26400     -0.219700          -0.10140
## 2000 meter buffer       0.13640      -0.25320      0.344100          -0.16260
## 1250 meter buffer       0.14800      -0.10360     -0.001055          -0.18070
## 1750 meter buffer       0.22750      -0.19500      0.294000          -0.18730
## 4250 meter buffer      -0.11660      -0.18610     -0.293100          -0.09876
## 2250 meter buffer       0.15090      -0.19680      0.460700          -0.13430
## 2500 meter buffer       0.24740      -0.14860      0.486300          -0.12260
## 3750 meter buffer       0.21800      -0.19970     -0.022810          -0.08265
## 4750 meter buffer      -0.18370      -0.42350     -0.194400          -0.05816
## 4500 meter buffer      -0.17970      -0.27850     -0.136900          -0.07673
## 3500 meter buffer       0.16050      -0.23730      0.053710          -0.09458
## 2750 meter buffer       0.35790      -0.18170      0.396400          -0.13140
## 3250 meter buffer       0.15540      -0.25380      0.195100          -0.10090
## 3000 meter buffer       0.27180      -0.20410      0.299600          -0.11560
## 1000 meter buffer       0.08223      -0.05412     -0.015710          -0.17790
##                   cnd(scl(trl)) cnd(scl(trn_lns)) cnd(scl(veg_edg))
## 500 meter buffer       0.100600         -0.040860          0.023270
## 750 meter buffer       0.069740         -0.092940          0.012010
## 5000 meter buffer     -0.020190          0.347600          0.095810
## 1500 meter buffer      0.108200         -0.015840         -0.153800
## 4000 meter buffer      0.026240          0.100600          0.020290
## 2000 meter buffer      0.095460          0.041260         -0.204200
## 1250 meter buffer      0.121100         -0.022110         -0.005932
## 1750 meter buffer      0.090890          0.014230         -0.200600
## 4250 meter buffer     -0.007788          0.080290         -0.099930
## 2250 meter buffer      0.099100         -0.036780         -0.182300
## 2500 meter buffer      0.082090         -0.075010         -0.245800
## 3750 meter buffer      0.084780          0.039750          0.020450
## 4750 meter buffer      0.007153          0.227900          0.035720
## 4500 meter buffer      0.016340          0.129000         -0.055290
## 3500 meter buffer      0.086630          0.048130         -0.063410
## 2750 meter buffer      0.056150         -0.023110         -0.310900
## 3250 meter buffer      0.074220          0.040800         -0.210900
## 3000 meter buffer      0.055610         -0.000563         -0.253600
## 1000 meter buffer      0.124600         -0.040290          0.021150
##                   cnd(scl(wll)) df   logLik  AICc delta
## 500 meter buffer       0.080920 21 -308.069 665.1  0.00
## 750 meter buffer       0.141400 21 -309.625 668.3  3.11
## 5000 meter buffer      0.415300 22 -309.255 670.2  5.10
## 1500 meter buffer      0.122000 22 -309.876 671.5  6.34
## 4000 meter buffer      0.312900 22 -309.988 671.7  6.56
## 2000 meter buffer      0.061450 22 -310.388 672.5  7.36
## 1250 meter buffer      0.064970 22 -310.566 672.9  7.72
## 1750 meter buffer      0.076950 22 -310.571 672.9  7.73
## 4250 meter buffer      0.336400 22 -310.643 673.0  7.87
## 2250 meter buffer      0.003626 22 -311.175 674.1  8.94
## 2500 meter buffer      0.013230 22 -311.256 674.2  9.10
## 3750 meter buffer      0.078870 22 -311.489 674.7  9.56
## 4750 meter buffer      0.330600 22 -311.726 675.2 10.04
## 4500 meter buffer      0.372500 22 -311.994 675.7 10.57
## 3500 meter buffer      0.026350 22 -312.336 676.4 11.26
## 2750 meter buffer      0.074580 22 -312.422 676.6 11.43
## 3250 meter buffer     -0.011380 22 -312.606 676.9 11.80
## 3000 meter buffer      0.029770 22 -312.794 677.3 12.17
## 1000 meter buffer      0.077150 22                     
## Models ranked by AICc(x) 
## Random terms (all models): 
##   cond(1 | array)

this looks much more realistic; the 500 m buffer is top model for black bears

So what we will do for each species is remove the 250 meter buffer for now since there are some data missing, and compare just the other buffer sizes that contain the full data set

Model summary 500m

Let’s take a look at the model summary for the top model

summary(black_bear_mods_no250$`500 meter buffer`)
##  Family: binomial  ( logit )
## Formula:          
## cbind(black_bear, absent_black_bear) ~ scale(borrowpits) + scale(pipeline) +  
##     scale(clearings) + scale(harvest) + scale(facilities) + scale(mines) +  
##     scale(roads) + scale(seismic_lines) + scale(transmission_lines) +  
##     scale(trails) + scale(veg_edges) + scale(wells) + scale(lc_class110) +  
##     scale(lc_class20) + scale(lc_class210) + scale(lc_class220) +  
##     scale(lc_class230) + scale(lc_class33) + scale(lc_class34) +  
##     scale(lc_class50) + (1 | array)
## Data: .
## 
##      AIC      BIC   logLik deviance df.resid 
##    658.1    721.9   -308.1    616.1      133 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance Std.Dev.
##  array  (Intercept) 0.01327  0.1152  
## Number of obs: 154, groups:  array, 4
## 
## Conditional model:
##                            Estimate Std. Error z value Pr(>|z|)    
## (Intercept)               -0.708829   0.081608  -8.686   <2e-16 ***
## scale(borrowpits)          0.046269   0.064984   0.712   0.4765    
## scale(pipeline)           -0.055450   0.105537  -0.525   0.5993    
## scale(clearings)           0.073326   0.069247   1.059   0.2896    
## scale(harvest)             0.004147   0.059027   0.070   0.9440    
## scale(facilities)         -0.168463   0.079322  -2.124   0.0337 *  
## scale(mines)               0.134867   0.078233   1.724   0.0847 .  
## scale(roads)              -0.299655   0.160307  -1.869   0.0616 .  
## scale(seismic_lines)      -0.079761   0.071753  -1.112   0.2663    
## scale(transmission_lines) -0.040863   0.099056  -0.413   0.6800    
## scale(trails)              0.100629   0.060398   1.666   0.0957 .  
## scale(veg_edges)           0.023267   0.128452   0.181   0.8563    
## scale(wells)               0.080916   0.077363   1.046   0.2956    
## scale(lc_class110)         0.045271   0.079573   0.569   0.5694    
## scale(lc_class20)         -0.052057   0.062257  -0.836   0.4031    
## scale(lc_class210)        -0.163897   0.165605  -0.990   0.3223    
## scale(lc_class220)         0.094103   0.125506   0.750   0.4534    
## scale(lc_class230)        -0.145678   0.088846  -1.640   0.1011    
## scale(lc_class33)         -0.111250   0.095697  -1.163   0.2450    
## scale(lc_class34)          0.024545   0.112412   0.218   0.8272    
## scale(lc_class50)                NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s repeat this process for each species that we have enough data for.

Caribou

We may or may not have enough data for caribou but let’s try it at least for this preliminary report

We can use the same code from black bears (above) to run global models for each buffer width except remember we want to remove 250 meters

And in the same chunk to save time let’s also run the model.sel() function

Models/Model selection

caribou_mods_no250 <- final_osm_2023_df %>%
  
  # remove 350 meter buffer width
  purrr::discard_at('250 meter buffer') %>% 
  
  # use purrr map to make global models for all other buffer sizes
  purrr::map(
    ~.x %>%

     glmmTMB::glmmTMB(cbind(caribou, absent_caribou) ~
                        # HFI covariates in alphabetical order
                        scale(borrowpits) +
                        scale(pipeline) +
                        scale(clearings) +
                        scale(harvest) +
                        scale(facilities) +
                        scale(mines) +
                        scale(roads) +
                        scale(seismic_lines) +
                        scale(transmission_lines) +
                        scale(trails) +
                         scale(veg_edges) +
                        scale(wells) +
                        
                        # VEG covariates in numerical order
                        scale(lc_class110) +
                        scale(lc_class20) +
                        scale(lc_class210) +
                        scale(lc_class220) +
                        scale(lc_class230) +
                        scale(lc_class33) +
                        scale(lc_class34) +
                        scale(lc_class50) +
                        (1|array),
                   data = .,
                   family = 'binomial'))
## dropping columns from rank-deficient conditional model: scale(lc_class50)
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')
## dropping columns from rank-deficient conditional model: scale(lc_class50)
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')

## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')

## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')
## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; function evaluation limit reached without convergence (9). See
## vignette('troubleshooting'), help('diagnose')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')

## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')
## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation

## Warning in (function (start, objective, gradient = NULL, hessian = NULL, :
## NA/NaN function evaluation
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; function evaluation limit reached without convergence (9). See
## vignette('troubleshooting'), help('diagnose')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')

## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')

## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): failed to invert
## Hessian from numDeriv::jacobian(), falling back to internal vcov estimate
## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')

## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')

## Warning in finalizeTMB(TMBStruc, obj, fit, h, data.tmb.old): Model convergence
## problem; singular convergence (7). See vignette('troubleshooting'),
## help('diagnose')
# model selection
model.sel(caribou_mods_no250)
## Warning in sqrt(diag(vcovs)): NaNs produced
## Warning in sqrt(diag(vcovs)): NaNs produced

## Warning in sqrt(diag(vcovs)): NaNs produced
## Model selection table 
##                   cnd((Int)) dsp((Int)) cnd(scl(brr)) cnd(scl(clr))
## 3500 meter buffer     -39.97          +       0.52630       -4.4770
## 3250 meter buffer     -48.20          +       0.57850       -8.3020
## 4000 meter buffer     -69.82          +      -1.03200      -12.8700
## 4250 meter buffer    -102.70          +      -0.65960      -13.6100
## 4500 meter buffer    -154.50          +       0.19870      -29.1700
## 5000 meter buffer    -141.00          +      -0.24470      -16.6200
## 4750 meter buffer    -145.70          +       0.46280      -18.5600
## 2000 meter buffer    -126.60          +       0.24330       -0.5072
## 1750 meter buffer     -45.54          +       0.28030       -0.3973
## 1000 meter buffer    -211.40          +      -0.73770       -0.7295
## 1250 meter buffer     -83.44          +      -0.14220       -0.5565
## 750 meter buffer      -57.73          +      -0.36300       -3.9610
## 500 meter buffer     -115.60          +      -0.08530       -2.3090
## 1500 meter buffer     -54.48          +      -0.01644       -0.6089
## 2250 meter buffer     -98.05          +       0.21490       -1.3040
## 2500 meter buffer   -3726.00          +      -0.61180       -6.3820
## 2750 meter buffer    -113.90          +      -0.60290       -9.5490
## 3000 meter buffer     -87.85          +      -0.27950       -8.3510
## 3750 meter buffer     -76.00          +      -0.43520       -8.9520
##                   cnd(scl(fcl)) cnd(scl(hrv)) cnd(scl(lc_c11)) cnd(scl(lc_c20))
## 3500 meter buffer       22.0800       1.30800          0.44670          0.19240
## 3250 meter buffer       20.1600       1.07000          0.28720          1.05800
## 4000 meter buffer        5.6650       1.64700          0.17720         -1.37200
## 4250 meter buffer        7.3940       0.92010          0.39600         -2.04500
## 4500 meter buffer       16.4800      -1.20000          2.91800         -5.65800
## 5000 meter buffer       15.0100      -3.23600          2.01700         -3.41500
## 4750 meter buffer       11.9500      -1.09400          2.46100         -4.47300
## 2000 meter buffer        1.7760       0.40280         -0.47450          1.62600
## 1750 meter buffer        0.5079       0.39490         -0.30720          0.89110
## 1000 meter buffer       -3.0040      -0.04756          0.09140          0.21280
## 1250 meter buffer       -1.6010       0.22650         -0.18520          0.23050
## 750 meter buffer        -2.0090      -0.28870         -0.08214          0.30160
## 500 meter buffer        -1.3450      -0.84310         -0.07359          0.13350
## 1500 meter buffer       -0.6869       0.11540          0.01538          0.87390
## 2250 meter buffer        4.4380       0.57960         -0.56600          1.62500
## 2500 meter buffer       32.6600       0.44650         -0.99940          0.20600
## 2750 meter buffer       25.2700       0.23450          0.12350          2.31500
## 3000 meter buffer       22.6500       0.19820          0.35610          2.46000
## 3750 meter buffer        4.3230       2.20200          0.12080         -0.04679
##                   cnd(scl(lc_c21)) cnd(scl(lc_c22)) cnd(scl(lc_c23))
## 3500 meter buffer           2.4830          -1.6570          -1.2630
## 3250 meter buffer           2.5700          -3.3320          -0.1300
## 4000 meter buffer           0.5373          -0.5936          -2.6740
## 4250 meter buffer           1.0390          -1.5890          -2.0330
## 4500 meter buffer           0.5238          -3.5620          -4.2500
## 5000 meter buffer           0.6232          -4.1620          -2.8020
## 4750 meter buffer           0.5036          -3.3130          -3.5540
## 2000 meter buffer           1.9510          -4.4060          -0.3944
## 1750 meter buffer           2.0940          -4.4780          -0.0707
## 1000 meter buffer           0.7628          -1.5410          -0.1259
## 1250 meter buffer           1.2090          -2.3180          -0.1875
## 750 meter buffer            0.2055          -1.7860          -0.1655
## 500 meter buffer            0.4276          -1.8010           0.1041
## 1500 meter buffer           2.2980          -5.0590           0.1476
## 2250 meter buffer           2.2590          -1.9130          -0.4929
## 2500 meter buffer          -4.6990          -9.0930          -2.7470
## 2750 meter buffer           3.0990          -3.8600          -0.4273
## 3000 meter buffer           3.6310          -5.0790           0.2340
## 3750 meter buffer           0.8903          -1.5050          -1.9120
##                   cnd(scl(lc_c33)) cnd(scl(lc_c34)) cnd(scl(lc_c50))
## 3500 meter buffer         -42.9500           7.8150           1.4610
## 3250 meter buffer         -40.7800          10.2400           1.3140
## 4000 meter buffer          -3.8620           4.1570           0.8342
## 4250 meter buffer          -3.8650           3.9150           0.8337
## 4500 meter buffer          -1.0740          20.2100          -1.9180
## 5000 meter buffer           1.1400          14.3100          -0.8480
## 4750 meter buffer          -0.5216          16.4600          -1.6960
## 2000 meter buffer         -11.0900           4.9720           0.9729
## 1750 meter buffer          -6.1090           4.0100           0.8697
## 1000 meter buffer           1.5910          -0.3428           0.5619
## 1250 meter buffer          -0.3738           0.5782           0.6583
## 750 meter buffer            5.1470          -0.5521                 
## 500 meter buffer            3.8460          -0.4170                 
## 1500 meter buffer          -2.5440           2.8520           0.8777
## 2250 meter buffer         -15.3700           6.5530           1.2280
## 2500 meter buffer         -76.4600          12.1600          -0.7792
## 2750 meter buffer         -54.3900          11.5200           1.4620
## 3000 meter buffer         -49.8600          10.5600           1.4030
## 3750 meter buffer          -6.5370           3.2410           0.8534
##                   cnd(scl(mns)) cnd(scl(ppl)) cnd(scl(rds)) cnd(scl(ssm_lns))
## 3500 meter buffer       9.38500       4.25900     -26.65000          -1.82200
## 3250 meter buffer      -8.20700       3.65400     -25.15000          -1.02600
## 4000 meter buffer      14.20000       2.66000      -5.81900           0.19520
## 4250 meter buffer      12.56000       3.41500      -6.95800          -1.41400
## 4500 meter buffer     -11.01000       0.84190     -18.73000           3.74900
## 5000 meter buffer      -8.46600       1.68300     -10.02000           1.51900
## 4750 meter buffer     -10.33000       0.02599     -11.79000           3.69300
## 2000 meter buffer       4.25000       1.21100      -7.85400           2.52500
## 1750 meter buffer       2.91100       1.32200      -6.86500           1.83000
## 1000 meter buffer      -0.54750       1.50800      -0.82740           0.09458
## 1250 meter buffer      -0.01695       1.47900      -3.08400           0.96880
## 750 meter buffer       -2.65800       0.91230       0.05739           0.71450
## 500 meter buffer      -93.67000       0.16430       0.83100           0.21080
## 1500 meter buffer       0.45280       0.92940      -3.49600           1.28900
## 2250 meter buffer      -2.44500       1.26400     -11.09000           3.07400
## 2500 meter buffer     -71.22000       0.58240     -14.94000           3.06300
## 2750 meter buffer     -50.05000       0.51380     -12.93000           2.29900
## 3000 meter buffer     -33.78000       0.24540     -11.94000           0.90010
## 3750 meter buffer      16.06000       2.56300      -5.19400           0.58190
##                   cnd(scl(trl)) cnd(scl(trn_lns)) cnd(scl(veg_edg))
## 3500 meter buffer      -4.45900            -37.92            7.7930
## 3250 meter buffer      -4.10100            -44.73            7.7910
## 4000 meter buffer      -1.88400            -86.86           -1.9750
## 4250 meter buffer       0.41700           -130.30           -2.4580
## 4500 meter buffer       3.65800           -192.90           10.1000
## 5000 meter buffer       5.05700           -174.90           -5.7510
## 4750 meter buffer       2.83400           -181.80            3.7210
## 2000 meter buffer      -1.27900           -209.30            2.2460
## 1750 meter buffer      -1.05200            -67.55            2.0690
## 1000 meter buffer      -0.10380           -419.70            0.5945
## 1250 meter buffer      -0.02513           -150.50            2.3820
## 750 meter buffer       -0.16810           -107.70            2.0860
## 500 meter buffer       -0.11810           -230.60            1.6490
## 1500 meter buffer       0.08393            -87.14            0.4934
## 2250 meter buffer      -2.36600           -150.50            3.3040
## 2500 meter buffer      -3.42100          -6087.00           -1.7490
## 2750 meter buffer      -2.61600           -137.50           -0.8197
## 3000 meter buffer      -2.34400            -98.61           -0.2538
## 3750 meter buffer      -3.35900            -99.57           -0.3062
##                   cnd(scl(wll)) df  logLik  AICc delta
## 3500 meter buffer       5.33100 22 -56.529 164.8  0.00
## 3250 meter buffer       6.19900 22 -57.095 165.9  1.13
## 4000 meter buffer      -2.97800 22 -58.827 169.4  4.60
## 4250 meter buffer      -2.19700 22 -59.286 170.3  5.51
## 4500 meter buffer       1.34300 22 -59.734 171.2  6.41
## 5000 meter buffer       5.31300 22 -62.553 176.8 12.05
## 4750 meter buffer       0.23400 22 -62.984 177.7 12.91
## 2000 meter buffer       0.02485 22 -69.321 190.4 25.58
## 1750 meter buffer       0.26890 22 -72.133 196.0 31.21
## 1000 meter buffer       0.70440 22 -80.257 212.2 47.46
## 1250 meter buffer      -0.09405 22 -83.151 218.0 53.24
## 750 meter buffer        0.64660 21 -84.979 219.0 54.17
## 500 meter buffer        0.25930 21 -92.716 234.4 69.65
## 1500 meter buffer       0.22370 22                    
## 2250 meter buffer       1.13000 22                    
## 2500 meter buffer       5.26100 22                    
## 2750 meter buffer       5.09100 22                    
## 3000 meter buffer       5.64400 22                    
## 3750 meter buffer      -4.14300 22                    
## Models ranked by AICc(x) 
## Random terms (all models): 
##   cond(1 | array)

We get a warning that there are some model convergence problems, I expect this is because we don’t have enough data for caribou but I don’t have time to dig into this now so we will investigate more closely for final analysis

For caribou 3500m buffer is top model for now

Model summary 3500m

Let’s take a closer look at the top model summary

summary(caribou_mods_no250$`3500 meter buffer`)
##  Family: binomial  ( logit )
## Formula:          
## cbind(caribou, absent_caribou) ~ scale(borrowpits) + scale(pipeline) +  
##     scale(clearings) + scale(harvest) + scale(facilities) + scale(mines) +  
##     scale(roads) + scale(seismic_lines) + scale(transmission_lines) +  
##     scale(trails) + scale(veg_edges) + scale(wells) + scale(lc_class110) +  
##     scale(lc_class20) + scale(lc_class210) + scale(lc_class220) +  
##     scale(lc_class230) + scale(lc_class33) + scale(lc_class34) +  
##     scale(lc_class50) + (1 | array)
## Data: .
## 
##      AIC      BIC   logLik deviance df.resid 
##    157.1    223.9    -56.5    113.1      132 
## 
## Random effects:
## 
## Conditional model:
##  Groups Name        Variance  Std.Dev. 
##  array  (Intercept) 3.968e-14 1.992e-07
## Number of obs: 154, groups:  array, 4
## 
## Conditional model:
##                             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)               -3.997e+01  2.620e+03  -0.015   0.9878  
## scale(borrowpits)          5.263e-01  8.465e-01   0.622   0.5341  
## scale(pipeline)            4.259e+00  2.163e+00   1.969   0.0489 *
## scale(clearings)          -4.477e+00  4.411e+00  -1.015   0.3101  
## scale(harvest)             1.308e+00  1.313e+00   0.996   0.3191  
## scale(facilities)          2.208e+01  1.044e+01   2.115   0.0344 *
## scale(mines)               9.385e+00  1.866e+01   0.503   0.6150  
## scale(roads)              -2.665e+01  1.188e+01  -2.243   0.0249 *
## scale(seismic_lines)      -1.822e+00  2.566e+00  -0.710   0.4778  
## scale(transmission_lines) -3.792e+01  3.924e+03  -0.010   0.9923  
## scale(trails)             -4.459e+00  2.262e+00  -1.971   0.0487 *
## scale(veg_edges)           7.793e+00  5.957e+00   1.308   0.1908  
## scale(wells)               5.331e+00  6.442e+00   0.828   0.4080  
## scale(lc_class110)         4.467e-01  1.340e+04   0.000   1.0000  
## scale(lc_class20)          1.924e-01  2.481e+04   0.000   1.0000  
## scale(lc_class210)         2.482e+00  1.155e+05   0.000   1.0000  
## scale(lc_class220)        -1.657e+00  7.536e+04   0.000   1.0000  
## scale(lc_class230)        -1.263e+00  3.057e+04   0.000   1.0000  
## scale(lc_class33)         -4.295e+01  8.310e+03  -0.005   0.9959  
## scale(lc_class34)          7.815e+00  1.992e+04   0.000   0.9997  
## scale(lc_class50)          1.461e+00  3.356e+04   0.000   1.0000  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There’s nothing that catches my eye immediately as being sus about this particular model so it may not have been one with convergence issues. We will keep it in report for now

Coyote

Models/Model selection

coyote_mods_no250 <- final_osm_2023_df %>%
  
  # remove 350 meter buffer width
  purrr::discard_at('250 meter buffer') %>% 
  
  # use purrr map to make global models for all other buffer sizes
  purrr::map(
    ~.x %>%

     glmmTMB::glmmTMB(cbind(coyote, absent_coyote) ~
                        seismic_lines +
                        pipeline +
                        borrowpits +
                        wellsites +
                        roads +
                        trails + 
                        lc_class20 +
                        lc_class34 +
                        lc_class50 +
                        lc_class110 +
                        lc_class210 +
                        lc_class220 +
                        lc_class230 +
                        (1|array),
                   data = .,
                   family = 'binomial'))

# model selection
model.sel(coyote_mods_no250)

for coyote top model appears to be 4500 m by quite a bit

Model summary 4500m

Let’s get the model summary for this model

summary(coyote_mods_no250$`4500 meter buffer`)

There is one lc class with a very high estimate and SE which seems a bit sus to me

Fisher

Models/Model selection

fisher_mods_no250 <- final_osm_2023_df %>%
  
  # remove 350 meter buffer width
  purrr::discard_at('250 meter buffer') %>% 
  
  # use purrr map to make global models for all other buffer sizes
  purrr::map(
    ~.x %>%

     glmmTMB::glmmTMB(cbind(fisher, absent_fisher) ~
                        seismic_lines +
                        pipeline +
                        borrowpits +
                        wellsites +
                        roads +
                        trails + 
                        lc_class20 +
                        lc_class34 +
                        lc_class50 +
                        lc_class110 +
                        lc_class210 +
                        lc_class220 +
                        lc_class230 +
                        (1|array),
                   data = .,
                   family = 'binomial'))

# model selection
model.sel(fisher_mods_no250)

For fisher top model is 2000 meter

Model summary 2000m

Let’s print the summary for this model

summary(fisher_mods_no250$`2000 meter buffer`)

Again lc_class34 has a very high standard error, we may not have enough data in this landcover class to use in the final analysis

Grey wolf

Models/Model selection

wolf_mods_no250 <- final_osm_2023_df %>%
  
  # remove 350 meter buffer width
  purrr::discard_at('250 meter buffer') %>% 
  
  # use purrr map to make global models for all other buffer sizes
  purrr::map(
    ~.x %>%

     glmmTMB::glmmTMB(cbind(grey_wolf, absent_grey_wolf) ~
                        seismic_lines +
                        pipeline +
                        borrowpits +
                        wellsites +
                        roads +
                        trails + 
                        lc_class20 +
                        lc_class34 +
                        lc_class50 +
                        lc_class110 +
                        lc_class210 +
                        lc_class220 +
                        lc_class230 +
                        (1|array),
                   data = .,
                   family = 'binomial'))

# model selection
model.sel(wolf_mods_no250)

For grey wolf top model is 4500 m buffer

Model summary 4500m

Let’s get the model summary for this model

summary(wolf_mods_no250$`4500 meter buffer`)

lc_class34 still presenting some issues, interesting that seismic lines weren’t significant and have a negative estimate

Lynx

Models/Model selection

lynx_mods_no250 <- final_osm_2023_df %>%
  
  # remove 350 meter buffer width
  purrr::discard_at('250 meter buffer') %>% 
  
  # use purrr map to make global models for all other buffer sizes
  purrr::map(
    ~.x %>%

     glmmTMB::glmmTMB(cbind(lynx, absent_lynx) ~
                        seismic_lines +
                        pipeline +
                        borrowpits +
                        wellsites +
                        roads +
                        trails + 
                        lc_class20 +
                        lc_class34 +
                        lc_class50 +
                        lc_class110 +
                        lc_class210 +
                        lc_class220 +
                        lc_class230 +
                        (1|array),
                   data = .,
                   family = 'binomial'))

# model selection
model.sel(lynx_mods_no250)

For lynx the top model is the 1000 m buffer

Model summary 1000m

Let’s get the model summary

summary(lynx_mods_no250$`1000 meter buffer`)

Moose

Models/Model selection

moose_mods_no250 <- final_osm_2023_df %>%
  
  # remove 350 meter buffer width
  purrr::discard_at('250 meter buffer') %>% 
  
  # use purrr map to make global models for all other buffer sizes
  purrr::map(
    ~.x %>%

     glmmTMB::glmmTMB(cbind(moose, absent_moose) ~
                        seismic_lines +
                        pipeline +
                        borrowpits +
                        wellsites +
                        roads +
                        trails + 
                        lc_class20 +
                        lc_class34 +
                        lc_class50 +
                        lc_class110 +
                        lc_class210 +
                        lc_class220 +
                        lc_class230 +
                        (1|array),
                   data = .,
                   family = 'binomial'))

# model selection
model.sel(moose_mods_no250)

For moose the top model is the 3750 m buffer

Model summary 3750m

Let’s get the model summary

summary(moose_mods_no250$`3750 meter buffer`)

Red fox

Models/Model selection

fox_mods_no250 <- final_osm_2023_df %>%
  
  # remove 350 meter buffer width
  purrr::discard_at('250 meter buffer') %>% 
  
  # use purrr map to make global models for all other buffer sizes
  purrr::map(
    ~.x %>%

     glmmTMB::glmmTMB(cbind(red_fox, absent_red_fox) ~
                        seismic_lines +
                        pipeline +
                        borrowpits +
                        wellsites +
                        roads +
                        trails + 
                        lc_class20 +
                        lc_class34 +
                        lc_class50 +
                        lc_class110 +
                        lc_class210 +
                        lc_class220 +
                        lc_class230 +
                        (1|array),
                   data = .,
                   family = 'binomial'))

# model selection
model.sel(fox_mods_no250)

For red fox the top model is 3750 m buffer

Model summary 3750m

Let’s get the model summary

summary(fox_mods_no250$`3750 meter buffer`)

Gagh! Borrow pits does not have a reasonable estimate and SE

White-tailed deer

Models/Model selection

deer_mods_no250 <- final_osm_2023_df %>%
  
  # remove 350 meter buffer width
  purrr::discard_at('250 meter buffer') %>% 
  
  # use purrr map to make global models for all other buffer sizes
  purrr::map(
    ~.x %>%

      # have to include the `` around the white-tailed_deer or R won't recognize it as a variable because of the -
     glmmTMB::glmmTMB(cbind(`white-tailed_deer`, `absent_white-tailed_deer`) ~
                        seismic_lines +
                        pipeline +
                        borrowpits +
                        wellsites +
                        roads +
                        trails + 
                        lc_class20 +
                        lc_class34 +
                        lc_class50 +
                        lc_class110 +
                        lc_class210 +
                        lc_class220 +
                        lc_class230 +
                        (1|array),
                   data = .,
                   family = 'binomial'))

# model selection
model.sel(deer_mods_no250)

For deer the top model was also the 3750 buffer

Model summary 3750m

Let’s get the model summary

summary(deer_mods_no250$`3750 meter buffer`)

Coefficient plots

What we need to do now is extract the coefficient estimates from each top model, as well as the 95% CI so we can plot them for easier visualization and understanding of the data

Extract coefficients

The confint() function will extract the coefficients and CI intervals from a model, so what we need to do is make a list of all the models, then use the map() function in purrr to apply the confint() function to all the models and get the coefficients. We want this to result in a tibble that has a column for the HFI feature (we aren’t plotting the lc_class data for this report), the upper and lower CI, and the coefficient estimate.

In order to do this we have to do a bit of data wrangling, currently this isn’t the most pleasing way to accomplish the desired outcome, but it works.

# This is also a dog shit way to do this but I need to get this done

# make a list of the top models for each species
top_models <- list(black_bear_mods_no250$`500 meter buffer`,
                   caribou_mods_no250$`1250 meter buffer`,
                   coyote_mods_no250$`4500 meter buffer`,
                   fisher_mods_no250$`2000 meter buffer`,
                   wolf_mods_no250$`4500 meter buffer`,
                   lynx_mods_no250$`1000 meter buffer`,
                   moose_mods_no250$`3750 meter buffer`,
                   fox_mods_no250$`3750 meter buffer`,
                   deer_mods_no250$`3750 meter buffer`) %>%  
  
  # pipe into purrr to create coefficient table for all models
  purrr::map(
    ~.x %>% 
      
      # extract the coefficients and upper and lower CI
      confint() %>% 
 
      # format resulting object as a tibble data frame
      as_tibble() %>%
      
      # subset to just the HFI variables for these plots
      slice_head(n = 6) %>% 
      
      # add a column where we can put the feature names
      rowid_to_column() %>% 
      
      # rename the columns for plotting
      rename('lower' = `2.5 %`,
             'upper' = `97.5 %`,
             'estimate' = Estimate,
             'feature' = rowid) %>% 
      
      # rename the entries to features, need to look at the order the features are in from the model summary and ensure it matches
      mutate(feature = as.factor(feature),
             feature = recode(feature,
                              '1' = 'seismic_lines',
                              '2' = 'pipeline',
                              '3' = 'borrowpits',
                              '4' = 'wellsites',
                              '5' = 'roads',
                              '6' = 'trails'))) %>% 
  
  # set the names of each resulting tibble data frame to the species name
  purrr::set_names('Black bear',
                   'Caribou',
                   'Coyote',
                   'Fisher',
                   'Grey wolf',
                   'Lynx',
                   'Moose',
                   'Red fox',
                   'White-tailed deer')

Merge data frames

Now we have a data frame with the bet coefficients for each species, but if we want these on a plot together we need them all in one data frame.

To merge data into one data frame we can use the list_rbind() function from the purrr package which will take each element of the list and stack them on top of one another just like rbind does with data frames, and if we use the names_to argument we can extract the names of the list elements and assign them to a column so we know which data comes from which species model (list element)

In this code I also add a new column called uuid which contains the image id (uuid) for a phylopic silhouette of each species that I may want to use for plotting

Phylopic.or is an open source online database of silhouettes various contributors have created for use. There is an R package that works with this data called rphylopic; we can use the get_uuid() function from this package to extract the data for a silhouette for each species we want, which is what I’ve done here.

# combine all list elements 
coeffs_df_all <- list_rbind(top_models,
                            names_to = 'species') %>% 
  
  # change species to a factor for plotting
  mutate(species = as.factor(species),
         
         # add phylopic uuid for each species for plotting 
         # the uuid is extracted using getuuid with the species name as name = ''
         uuid = case_when(species == 'Black bear' ~ get_uuid(name = 'Ursus americanus'),
                          species == 'Caribou' ~ get_uuid(name = 'Rangifer tarandus'),
                          species == 'Coyote' ~ get_uuid(name = 'Canis latrans'),
                          species == 'Fisher' ~ get_uuid(name = 'Pekania pennanti'),
                          species == 'Grey wolf' ~ get_uuid(name = 'Canis lupus'),
                          species == 'Lynx' ~ get_uuid(name = 'Lynx lynx'),
                          species == 'Moose' ~ get_uuid(name = 'Alces alces'),
                          species == 'Red fox' ~ get_uuid(name = 'Vulpes vulpes'),
                          species == 'White-taield deer' ~ get_uuid(name ='Odocoileus virginianus'))) 

Plot coefficients

Now let’s explore some different options to plot the coefficients

All species

Let’s try plotting all the species on one plot using ggplot()

# provide data and mapping aesthetics
ggplot(coeffs_df_all, aes(x = feature, 
                          y = estimate, 
                          group = uuid)) +
  
   geom_errorbar(aes(ymin = lower, 
                     ymax = upper, 
                     color = feature),
                width = 0.4,
                linewidth = 0.5,
                position = position_dodge(width = 0.9)) +
  
    # add points for each estimate for each covariate and use position = position_dodge to shift the points so all the species don't plot on top of one another
  geom_phylopic(aes(x = feature, 
                 y = estimate,
                 uuid = uuid),
             position = position_dodge(width = 0.9),
             size = 2)
# combine all list elements 
coeffs_df_all <- list_rbind(top_models,
                            names_to = 'species') %>% 
  
  # change species to a factor for plotting
  mutate(species = as.factor(species),
         
         # add phylopic uuid for each species for plotting 
         # the uuid is extracted using getuuid with the species name as name = ''
         uuid = case_when(species == 'Black bear' ~ get_uuid(name = 'Ursus americanus'),
                          species == 'Caribou' ~ get_uuid(name = 'Rangifer tarandus'),
                          species == 'Coyote' ~ get_uuid(name = 'Canis latrans'),
                          species == 'Fisher' ~ '735066c6-2f3e-4f97-acb1-06f55ae075c9',
                          species == 'Grey wolf' ~ get_uuid(name = 'Canis lupus'),
                          species == 'Lynx' ~ get_uuid(name = 'Lynx lynx'),
                          species == 'Moose' ~ '74eab34a-498c-4614-aece-f02361874f79',
                          species == 'Red fox' ~ '9c977769-bf1e-44d4-82ab-f9f93dce39ca',
                          species == 'White-taield deer' ~ '56f6fdb2-15d0-43b5-b13f-714f2cb0f5d0')) %>% 
  
  # need to remove problematic estimate which is going to skew plot since its so large compared to others
  filter(!c(species == 'Red fox' &
           feature == 'wellsites'))

After plotting the moose image I don’t like it, let’s manually replace it in the data

# I went on the phylopic website and saw there are three images for moose, I like the last one better so we will use it

get_uuid(name = 'Alces alces',
         n = 3)

get_uuid(name = 'Odocoileus virginianus',
         n = 3)

get_uuid(name = 'Pekania pennanti',
         n = 2)

get_uuid(name = 'Vulpes',
         n = 2)


# Then I manually copied this uuid and replaces it in the code above

Try plotting all

ggplot(coeffs_df_all, aes(x = feature, 
                          y = estimate, 
                          group = uuid)) +
  
   geom_errorbar(aes(ymin = lower, 
                     ymax = upper, 
                     color = feature),
                width = 0.4,
                linewidth = 0.5,
                position = position_dodge(width = 1.2)) +
  
    # add points for each estimate for each covariate and use position = position_dodge to shift the points so all the species don't plot on top of one another
  geom_phylopic(aes(x = feature, 
                 y = estimate,
                 uuid = uuid),
             position = position_dodge(width = 1.2),
             size = 2)
ggplot(coeffs, aes(x = feature, y = estimate)) +
  
  geom_point(size = 3, position = position_dodge(width = 0.5)) +
  
  geom_errorbar(aes(ymin = lower, ymax = upper),
                width = 0.4,
                linewidth = 1,
                position = position_dodge(width = 0.5)) +
  
  geom_hline(yintercept = 0, linetype = "dashed")+
  scale_color_manual(values = c("#56B4E9", "#009E73"), name = "Spatial Scale")+
  theme_classic()+
  ggtitle("Moose Response to Anthropogenic Disturbance Features")+
  ylab("Coefficient Estimate \n \u00B1 95% CI")+
  scale_x_discrete(labels =c("Borrowpits", "Harvest\nAreas", "Industrial\nSites", "Pipelines","Roads", "Seismic\nLines", "Trails", "Transmission\nLines"))+
    theme(axis.title.x = element_blank(),
        axis.text.x =  element_text(size = 12),
        axis.title.y = element_text(size = 14),
        legend.title = element_text(size = 12),
        plot.title = element_text(size = 15, hjust = 0.5))

If all else fails can use plot_model function